Abstract

In this paper, we study the problem of adaptive estimation of the spectral density of a stationary
Gaussian process. For this purpose, we consider a wavelet-based method which combines the ideas of
wavelet approximation and estimation by information projection in order to guarantee that the
solution is a non-negative function. The spectral density of the process is estimated by projecting
the wavelet thresholding expansion of the periodogram onto a family of exponential functions. This
ensures that the spectral density estimator is a strictly positive function. The theoretical
behavior of the estimator is established in terms of the rate of convergence of the Kullback-Leibler
discrepancy over Besov classes. We also show the excellent practical performance of the estimator
in some numerical experiments.

Keywords: Spectral density estimation, adaptive estimation, wavelet thresholding, sequences of
exponential families, Besov spaces.

1 Introduction

The estimation of spectral densities is a fundamental problem in inference for stationary stochastic
processes. Many applications in several fields, such as weather forecasting and financial series, are
deeply related to this issue; see for instance Priestley [20]. It is known that the estimation of the
covariance function of a stationary process is strongly related to the estimation of the corresponding
spectral density. By Bochner's theorem, the covariance function is non-negative definite if and only
if the corresponding spectral density is a non-negative function. Hence, in order to preserve the
non-negative definiteness of a covariance function, the estimator of the corresponding spectral
density must be a non-negative function. The purpose of this work is to provide a non-negative
estimator of the spectral density.

Inference in the spectral domain uses the periodogram of the data, which provides an inconsistent
estimator that must be smoothed in order to achieve consistency. For highly regular spectral
densities, linear smoothing techniques such as kernel smoothing are appropriate (see Brillinger [1]).
However, these methods are not able to achieve the optimal mean-square rate of convergence for
spectra whose smoothness is distributed inhomogeneously over the domain of interest. For this,
nonlinear methods are needed. One nonlinear method for adaptive spectral density estimation of a
stationary Gaussian sequence was proposed by Comte [7]. It is based on model selection techniques.
Other nonlinear smoothing procedures are the wavelet thresholding methods, first proposed by
Donoho and Johnstone [12]. In this context, different thresholding rules have been proposed by
Neumann [18] and Fryzlewicz, Nason and von Sachs [13], to name but a few.

Neumann's approach [18] consists in pre-estimating the variance of the periodogram via kernel
smoothing, so that it can be supplied to the wavelet estimation procedure. Kernel pre-estimation
may not be appropriate in cases where the underlying spectral density is of low regularity. One
way to avoid this problem is proposed in Fryzlewicz, Nason and von Sachs [13], where the empirical
wavelet coefficient thresholds are built as appropriate local weighted $\ell_1$ norms of the periodogram.
These methods do not produce non-negative spectral density estimators; therefore the corresponding
estimators of the covariance function are not non-negative definite.

To overcome the drawbacks of previous estimators, in this paper we propose a new wavelet-based
method for the estimation of the spectral density of a Gaussian process. To ensure non-negativity
of the spectral density estimator, our method combines the ideas of wavelet thresholding
and estimation by information projection. We estimate the spectral density by a projection of
the nonlinear wavelet approximation of the periodogram onto a family of exponential functions.
Therefore, the estimator is non-negative by construction. This technique was studied by Barron and
Sheu [2] for the approximation of density functions by sequences of exponential families, by Loubes
and Yan [16] for penalized maximum likelihood estimation with $\ell_1$ penalty, by Antoniadis and Bigot
[1] for the study of Poisson inverse problems, and by Bigot and Van Bellegem [5] for log-density
deconvolution.

The theoretical optimality of the estimators for the spectral density of a stationary process is
generally studied using risk bounds in $L_2$-norm. This is the case in the papers of Neumann [18],
Comte [7] and Fryzlewicz, Nason and von Sachs [13] mentioned before. In this work, the behavior
of the proposed estimator is established in terms of the rate of convergence of the Kullback-Leibler
discrepancy over Besov classes, which is arguably a more natural loss function for the estimation of a
spectral density function than the $L_2$-norm. Moreover, the thresholding rules that we use to derive
adaptive estimators differ from previous approaches based on wavelet decomposition and are quite
simple to compute. Finally, we compare the performance of our estimator with other estimators on
some simulations.

The paper is organized as follows. Section 2 presents the statistical framework under which 
we work. We define the model, the wavelet-based exponential family and the linear and nonlinear 
wavelet estimators by information projection. We also recall the definition of the Kullback-Leibler 
divergence and some results on Besov spaces. The rates of convergence of the proposed estimators
are stated in Section 3. Some numerical experiments are described in Section 4. Technical lemmas
and proofs of the main theorems are gathered in the Appendix. 

Throughout this paper C denotes a constant that may vary from line to line. The notation C(.) 
specifies the dependency of C on some quantities. 

2 Statistical framework 
2.1 The model 

We aim at providing a nonparametric adaptive estimator of the spectral density which is
non-negative, in order to guarantee that the covariance estimator is a non-negative
definite function. We consider the sequence $(X_t)_{t \in \mathbb{N}}$ that satisfies the following assumptions:

Assumption 1 The sequence $(X_1, \ldots, X_n)$ is an $n$-sample drawn from a stationary sequence of
Gaussian random variables.

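Let $\rho$ be the covariance function of the process, i.e. $\rho(h) = \mathrm{cov}(X_t, X_{t+h})$ with $h \in \mathbb{Z}$. The
spectral density $f$ is defined as:

$$f(\omega) = \frac{1}{2\pi} \sum_{h \in \mathbb{Z}} \rho(h) e^{-i 2\pi \omega h}, \quad \omega \in [0, 1].$$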

We need the following standard assumption on p: 

Assumption 2 The covariance function $\rho$ is non-negative definite, and there exist two constants
$0 < C_1, C_2 < +\infty$ such that $\sum_{h \in \mathbb{Z}} |\rho(h)| = C_1$ and $\sum_{h \in \mathbb{Z}} |h \rho^2(h)| = C_2$.

Assumption 2 implies in particular that the spectral density $f$ is bounded by the constant $C_1$.
As a consequence, it is also square integrable. As in Comte [7], the data consist of a number of
observations $X_1, \ldots, X_n$ at regularly spaced points. We want to obtain a positive estimator of the
spectral density function $f$ without parametric assumptions on the basis of these observations. For
this, we combine the ideas of wavelet thresholding and estimation by information projection.
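As a minimal illustration of the definition of the spectral density, the following sketch evaluates $f$ from a covariance function that vanishes beyond a finite lag. It assumes the normalisation $f(\omega) = \frac{1}{2\pi}\sum_h \rho(h) e^{-i2\pi\omega h}$ on $[0,1]$ and a symmetric real covariance; the function name `spectral_density` and the MA(1) example are illustrative choices, not taken from the paper.

```python
import math

def spectral_density(rho, omega):
    """Evaluate f(omega) = (1/(2*pi)) * sum_h rho(h) * exp(-i*2*pi*omega*h).

    `rho` maps a non-negative lag h to rho(h); symmetry rho(-h) = rho(h)
    is assumed, and lags not present in `rho` are treated as zero.
    """
    total = rho.get(0, 0.0)
    for h, value in rho.items():
        if h > 0:
            # The terms for +h and -h combine into 2 * rho(h) * cos(2*pi*omega*h).
            total += 2.0 * value * math.cos(2.0 * math.pi * omega * h)
    return total / (2.0 * math.pi)

# Hypothetical example: MA(1) process X_t = e_t + 0.5 e_{t-1},
# for which rho(0) = 1.25 and rho(1) = 0.5.
rho_ma1 = {0: 1.25, 1: 0.5}
values = [spectral_density(rho_ma1, k / 8) for k in range(9)]
```

For this covariance the density stays strictly positive on $[0,1]$, in line with Bochner's theorem.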



2.2 Estimation by information projection 

To ensure nonnegativity of the estimator, we will look for approximations over an exponential family. 
For this, we construct a sieve of exponential functions defined in a wavelet basis. 

Let $\phi(\omega)$ and $\psi(\omega)$, respectively, be the scaling and the wavelet functions generated by an
orthonormal multiresolution decomposition of $L_2([0,1])$; see Mallat [17] for a detailed exposition
on wavelet analysis. Throughout the paper, the functions $\phi$ and $\psi$ are supposed to be compactly
supported and such that $\|\phi\|_\infty < +\infty$, $\|\psi\|_\infty < +\infty$. Then, for any integer $j_0 \geq 0$, any function
$g \in L_2([0,1])$ has the following representation:

$$g(\omega) = \sum_{k=0}^{2^{j_0}-1} \langle g, \phi_{j_0,k} \rangle \phi_{j_0,k}(\omega) + \sum_{j=j_0}^{+\infty} \sum_{k=0}^{2^j-1} \langle g, \psi_{j,k} \rangle \psi_{j,k}(\omega),$$

where $\phi_{j_0,k}(\omega) = 2^{j_0/2} \phi(2^{j_0}\omega - k)$ and $\psi_{j,k}(\omega) = 2^{j/2} \psi(2^j\omega - k)$. The main idea of this paper is to
expand the spectral density $f$ onto this wavelet basis and to find an estimator of this expansion
that is then modified to impose the positivity property. The scaling and wavelet coefficients of the
spectral density function $f$ are denoted by $a_{j_0,k} = \langle f, \phi_{j_0,k} \rangle$ and $b_{j,k} = \langle f, \psi_{j,k} \rangle$.
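To make the coefficients $a_{j_0,k}$ and $b_{j,k}$ concrete, here is a small sketch that approximates them by a midpoint rule. It uses the Haar basis purely for simplicity, which is an assumption of this example (the paper works with general compactly supported wavelets), and the helper names are hypothetical.

```python
import math

def haar_phi(x):
    """Haar scaling function: indicator of [0, 1)."""
    return 1.0 if 0.0 <= x < 1.0 else 0.0

def haar_psi(x):
    """Haar wavelet: +1 on [0, 1/2), -1 on [1/2, 1)."""
    if 0.0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1.0:
        return -1.0
    return 0.0

def dilate(mother, j, k):
    """Return the function omega -> 2^{j/2} * mother(2^j * omega - k)."""
    scale = 2.0 ** (j / 2.0)
    return lambda w: scale * mother((2 ** j) * w - k)

def inner(g, h, m=4096):
    """Midpoint-rule approximation of the L2([0,1]) inner product <g, h>."""
    return sum(g((i + 0.5) / m) * h((i + 0.5) / m) for i in range(m)) / m

def haar_coefficients(g, j0, j1):
    """Scaling coefficients at level j0 and wavelet coefficients for j0 <= j < j1."""
    a = {k: inner(g, dilate(haar_phi, j0, k)) for k in range(2 ** j0)}
    b = {(j, k): inner(g, dilate(haar_psi, j, k))
         for j in range(j0, j1) for k in range(2 ** j)}
    return a, b

# Hypothetical test function on [0, 1].
g = lambda w: 1.0 + math.sin(2.0 * math.pi * w)
a, b = haar_coefficients(g, j0=0, j1=3)
```

With $j_0 = 0$ and levels $j < 3$, the sketch returns $1 + (1 + 2 + 4) = 8 = 2^3$ coefficients in total.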

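To simplify the notations, we write $(\psi_{j,k})_{j = j_0 - 1}$ for the scaling functions $(\phi_{j,k})_{j = j_0}$. Let $j_1 \geq j_0$
and define the set

$$\Lambda_{j_1} = \{ (j,k) : j_0 - 1 \leq j < j_1, \ 0 \leq k \leq 2^j - 1 \}.$$

Note that $\#\Lambda_{j_1} = 2^{j_1}$, where $\#\Lambda_{j_1}$ denotes the cardinal of $\Lambda_{j_1}$. Let $\theta$ denote a vector in $\mathbb{R}^{\#\Lambda_{j_1}}$; the
wavelet-based exponential family $\mathcal{E}_{j_1}$ at scale $j_1$ is defined as the set of functions:

$$\mathcal{E}_{j_1} = \Big\{ f_{j_1,\theta}(\cdot) = \exp\Big( \sum_{(j,k) \in \Lambda_{j_1}} \theta_{j,k} \psi_{j,k}(\cdot) \Big), \ \theta = (\theta_{j,k})_{(j,k) \in \Lambda_{j_1}} \in \mathbb{R}^{\#\Lambda_{j_1}} \Big\}. \quad (2.1)$$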
It is well known that Besov spaces for periodic functions in $L_2([0,1])$ can be characterized in
terms of wavelet coefficients (see e.g. Mallat [17]). Assume that $\psi$ has $m$ vanishing moments, and let
$0 < s < m$ denote the usual smoothness parameter. Then, for a Besov ball $B^s_{p,q}(A)$ of radius $A > 0$
with $1 \leq p, q \leq \infty$, one has that for $s^* = s + 1/2 - 1/p \geq 0$:

$$B^s_{p,q}(A) := \Big\{ g \in L_2([0,1]) : \|g\|_{s,p,q} := \Big( \sum_{k=0}^{2^{j_0}-1} |a_{j_0,k}|^p \Big)^{1/p} + \Big( \sum_{j=j_0}^{\infty} 2^{j s^* q} \Big( \sum_{k=0}^{2^j-1} |b_{j,k}|^p \Big)^{q/p} \Big)^{1/q} \leq A \Big\},$$

with the respective above sums replaced by maximum if $p = \infty$ or $q = \infty$ and where $a_{j_0,k} = \langle g, \phi_{j_0,k} \rangle$
and $b_{j,k} = \langle g, \psi_{j,k} \rangle$.

The condition that $s + 1/2 - 1/p \geq 0$ is imposed to ensure that $B^s_{p,q}(A)$ is a subspace of $L_2([0,1])$,
and we shall restrict ourselves to this case in this paper (although not always stated, it is clear that
all our results hold for $s < m$).

Let $M > 0$ and denote by $F^s_{p,q}(M)$ the set of functions such that

$$F^s_{p,q}(M) = \{ f = \exp(g) : \|g\|_{s,p,q} \leq M \},$$

where $\|g\|_{s,p,q}$ denotes the norm in the Besov space $B^s_{p,q}$. Note that assuming that $f \in F^s_{p,q}(M)$
implies that $f$ is strictly positive. The following results hold.

Lemma 2.1 Suppose that $f \in F^s_{p,q}(M)$ with $s > \frac{1}{p}$ and $1 \leq p \leq 2$. Then, there exists a constant $M_1$
such that for all $f \in F^s_{p,q}(M)$, $0 < M_1^{-1} \leq f \leq M_1 < +\infty$.

Let $V_j$ denote the usual multiresolution space at scale $j$ spanned by the scaling functions $(\phi_{j,k})_{0 \leq k \leq 2^j - 1}$,
and define $A_j < +\infty$ as the constant such that $\|v\|_\infty \leq A_j \|v\|_{L_2}$ for all $v \in V_j$. For $f \in F^s_{p,q}(M)$, let
$g = \log(f)$. Then for $j \geq j_0 - 1$, define $D_j = \|g - g_j\|_{L_2}$ and $\gamma_j = \|g - g_j\|_\infty$, where
$g_j = \sum_{k=0}^{2^j-1} \theta_{j,k} \psi_{j,k}$, with $\theta_{j,k} = \langle g, \psi_{j,k} \rangle$.

The proof of the following lemma immediately follows from the arguments in the proof of Lemma
A.5 in Antoniadis and Bigot [1].

Lemma 2.2 Let $j \in \mathbb{N}$. Then $A_j \leq C 2^{j/2}$. Suppose that $f \in F^s_{p,q}(M)$ with $1 \leq p \leq 2$ and
$s > \frac{1}{p}$. Then, uniformly over $F^s_{p,q}(M)$, $D_j \leq C 2^{-j(s + 1/2 - 1/p)}$ and $\gamma_j \leq C 2^{-j(s - 1/p)}$, where $C$ denotes
constants depending only on $M$, $s$, $p$ and $q$.

To assess the quality of the estimators, we will measure the discrepancy between an estimator $\hat{f}$
and the true function $f$ in the sense of relative entropy (Kullback-Leibler divergence) defined by:

$$\Delta(f; \hat{f}) = \int_0^1 \Big( f \log\Big( \frac{f}{\hat{f}} \Big) - f + \hat{f} \Big) d\mu,$$

where $\mu$ denotes the Lebesgue measure on $[0,1]$. It can be shown that $\Delta(f; \hat{f})$ is non-negative and
equals zero if and only if $\hat{f} = f$.
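The relative entropy between two positive functions on $[0,1]$ can be approximated numerically. The sketch below is only an illustration under the assumption that both functions are strictly positive and smooth enough for a midpoint rule; the name `kl_discrepancy` and the two test functions are hypothetical.

```python
import math

def kl_discrepancy(f, g, m=10000):
    """Midpoint-rule approximation of Delta(f; g) = int_0^1 (f log(f/g) - f + g) dmu."""
    total = 0.0
    for i in range(m):
        w = (i + 0.5) / m
        fw, gw = f(w), g(w)
        total += fw * math.log(fw / gw) - fw + gw
    return total / m

# Hypothetical strictly positive functions on [0, 1].
f = lambda w: math.exp(math.cos(2.0 * math.pi * w))
g = lambda w: 1.0

same = kl_discrepancy(f, f)       # zero when both arguments coincide
different = kl_discrepancy(f, g)  # strictly positive otherwise
```

Unlike the usual Kullback-Leibler divergence between probability densities, the extra terms $-f + \hat{f}$ keep the quantity non-negative even when the two functions do not integrate to the same value.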

We will force our estimator of the spectral density to belong to the family $\mathcal{E}_{j_1}$ of exponential
functions, which are positive by definition. For this we will consider a notion of projection using
information projection.
The estimation of density functions based on information projection was introduced by Barron
and Sheu [2]. To apply this method in our context, we recall for completeness a set of results that
are useful to prove the existence of our estimators. The proofs of the following lemmas immediately
follow from results in Barron and Sheu [2] and Antoniadis and Bigot [1].

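Lemma 2.3 Let $\beta \in \mathbb{R}^{\#\Lambda_{j_1}}$. Assume that there exists some $\theta(\beta) \in \mathbb{R}^{\#\Lambda_{j_1}}$ such that, for all $(j,k) \in \Lambda_{j_1}$,
$\theta(\beta)$ is a solution of

$$\langle f_{j_1,\theta(\beta)}, \psi_{j,k} \rangle = \beta_{j,k}.$$

Then for any function $f$ such that $\langle f, \psi_{j,k} \rangle = \beta_{j,k}$ for all $(j,k) \in \Lambda_{j_1}$, and for all $\theta \in \mathbb{R}^{\#\Lambda_{j_1}}$, the
following Pythagorean-like identity holds:

$$\Delta(f; f_{j_1,\theta}) = \Delta(f; f_{j_1,\theta(\beta)}) + \Delta(f_{j_1,\theta(\beta)}; f_{j_1,\theta}). \quad (2.2)$$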



The next lemma is a key result which gives sufficient conditions for the existence of the vector
$\theta(\beta)$ as defined in Lemma 2.3. This lemma also relates distances between the functions in the
exponential family to distances between the corresponding wavelet coefficients. Its proof relies upon
a series of lemmas on bounds within exponential families for the Kullback-Leibler divergence and
can be found in Barron and Sheu [2] and Antoniadis and Bigot [1].



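Lemma 2.4 Let $\theta_0 \in \mathbb{R}^{\#\Lambda_{j_1}}$, $\beta_0 = (\beta_{0,(j,k)})_{(j,k) \in \Lambda_{j_1}} \in \mathbb{R}^{\#\Lambda_{j_1}}$ such that $\beta_{0,(j,k)} = \langle f_{j_1,\theta_0}, \psi_{j,k} \rangle$ for
all $(j,k) \in \Lambda_{j_1}$, and $\tilde{\beta} \in \mathbb{R}^{\#\Lambda_{j_1}}$ a given vector. Let $b = \exp(\|\log(f_{j_1,\theta_0})\|_\infty)$ and $e = \exp(1)$. If

$$\|\tilde{\beta} - \beta_0\|_2 \leq \frac{1}{2 e b A_{j_1}},$$

then the solution $\theta(\tilde{\beta})$ of

$$\langle f_{j_1,\theta}, \psi_{j,k} \rangle = \tilde{\beta}_{j,k} \ \text{ for all } (j,k) \in \Lambda_{j_1}$$

exists and satisfies

$$\|\theta(\tilde{\beta}) - \theta_0\|_2 \leq 2 e b \|\tilde{\beta} - \beta_0\|_2,$$

$$\Big\| \log\Big( \frac{f_{j_1,\theta(\beta_0)}}{f_{j_1,\theta(\tilde{\beta})}} \Big) \Big\|_\infty \leq 2 e b A_{j_1} \|\tilde{\beta} - \beta_0\|_2,$$

where $\|\beta\|_2$ denotes the standard Euclidean norm for $\beta \in \mathbb{R}^{\#\Lambda_{j_1}}$.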



Following Csiszár [8], it is possible to define the projection of a function $f$ onto $\mathcal{E}_{j_1}$. If this
projection exists, it is defined as the function $f_{j_1,\theta^*}$ in the exponential family $\mathcal{E}_{j_1}$ that is the closest
to the true function $f$ in the Kullback-Leibler sense, and is characterized as the unique function in
the family $\mathcal{E}_{j_1}$ for which

$$\langle f_{j_1,\theta^*}, \psi_{j,k} \rangle = \langle f, \psi_{j,k} \rangle := \beta_{j,k} \ \text{ for all } (j,k) \in \Lambda_{j_1}.$$

Note that the notation $\beta_{j,k}$ is used to denote both the scaling coefficients $a_{j_0,k}$ and the wavelet
coefficients $b_{j,k}$.
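Let

$$I_n(\omega) = \frac{1}{2\pi n} \sum_{t=1}^{n} \sum_{t'=1}^{n} (X_t - \bar{X})(X_{t'} - \bar{X})^* e^{-i 2\pi \omega (t - t')}$$

be the classical periodogram, where $(X_t - \bar{X})^*$ denotes the conjugate transpose of $(X_t - \bar{X})$ and
$\bar{X} = \frac{1}{n} \sum_{t=1}^{n} X_t$. The expansion of $I_n(\omega)$ onto the wavelet basis allows one to obtain estimators of $a_{j_0,k}$
and $b_{j,k}$ given by

$$\hat{a}_{j_0,k} = \int_0^1 I_n(\omega) \phi_{j_0,k}(\omega) \, d\omega \quad \text{and} \quad \hat{b}_{j,k} = \int_0^1 I_n(\omega) \psi_{j,k}(\omega) \, d\omega. \quad (2.3)$$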
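The periodogram and the empirical coefficients (2.3) can be sketched as follows. This is an illustration only: it assumes real-valued data, writes the double sum in the periodogram as a squared modulus, and replaces the integral by a midpoint rule; the names `periodogram` and `empirical_coefficient` are hypothetical.

```python
import cmath
import math

def periodogram(x, omega):
    """I_n(omega) = |sum_t (x_t - mean) * exp(-i*2*pi*omega*t)|^2 / (2*pi*n)."""
    n = len(x)
    mean = sum(x) / n
    dft = sum((xt - mean) * cmath.exp(-2j * math.pi * omega * t)
              for t, xt in enumerate(x, start=1))
    return abs(dft) ** 2 / (2.0 * math.pi * n)

def empirical_coefficient(x, basis_fn, m=256):
    """Midpoint-rule approximation of int_0^1 I_n(omega) * basis_fn(omega) d omega."""
    return sum(periodogram(x, (i + 0.5) / m) * basis_fn((i + 0.5) / m)
               for i in range(m)) / m

# Hypothetical short data set; the constant basis function gives the
# integral of the periodogram, i.e. the empirical variance over 2*pi.
x = [1.0, -2.0, 0.5, 3.0, -1.0, 0.0, 2.0, -0.5]
total_power = empirical_coefficient(x, lambda w: 1.0)
```

Passing a scaling function $\phi_{j_0,k}$ or a wavelet $\psi_{j,k}$ as `basis_fn` would give approximations of $\hat{a}_{j_0,k}$ and $\hat{b}_{j,k}$ respectively.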



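It seems therefore natural to estimate the function $f$ by searching for some $\hat{\theta}_n \in \mathbb{R}^{\#\Lambda_{j_1}}$ such that

$$\langle f_{j_1,\hat{\theta}_n}, \psi_{j,k} \rangle = \int_0^1 I_n(\omega) \psi_{j,k}(\omega) \, d\omega := \hat{\beta}_{j,k} \ \text{ for all } (j,k) \in \Lambda_{j_1}, \quad (2.4)$$

where $\hat{\beta}_{j,k}$ denotes both the estimation of the scaling coefficients $\hat{a}_{j_0,k}$ and the wavelet coefficients
$\hat{b}_{j,k}$. The function $f_{j_1,\hat{\theta}_n}$ is the spectral density positive linear estimator.

Similarly, the positive nonlinear estimator with hard thresholding is defined as the function $f^{HT}_{j_1,\hat{\theta}_n,\xi}$
(with $\hat{\theta}_n \in \mathbb{R}^{\#\Lambda_{j_1}}$) such that

$$\langle f^{HT}_{j_1,\hat{\theta}_n,\xi}, \psi_{j,k} \rangle = \delta_\xi(\hat{\beta}_{j,k}) \ \text{ for all } (j,k) \in \Lambda_{j_1}, \quad (2.5)$$

where $\delta_\xi$ denotes the hard thresholding rule defined by

$$\delta_\xi(x) = x \, 1(|x| > \xi) \ \text{ for } x \in \mathbb{R},$$

where $\xi > 0$ is an appropriate threshold whose choice is discussed later on.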
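The hard thresholding rule is simple to state in code. The following sketch applies it with a level-dependent threshold to a dictionary of empirical coefficients; the data layout (keys `(j, k)`, one threshold per level `j`) and the numerical values are assumptions made for the example.

```python
def hard_threshold(x, xi):
    """Hard thresholding rule: keep x when |x| > xi, otherwise return 0."""
    return x if abs(x) > xi else 0.0

def threshold_coefficients(coefficients, thresholds):
    """Apply a level-dependent hard threshold.

    `coefficients` maps (j, k) to an empirical wavelet coefficient and
    `thresholds` maps the level j to its threshold xi_j.
    """
    return {(j, k): hard_threshold(value, thresholds[j])
            for (j, k), value in coefficients.items()}

# Hypothetical coefficients and thresholds.
b_hat = {(0, 0): 0.9, (1, 0): -0.05, (1, 1): 0.4}
kept = threshold_coefficients(b_hat, {0: 0.1, 1: 0.2})
```

Only the coefficient at `(1, 0)` is set to zero here, since its magnitude is below the threshold of its level.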

The existence of these estimators is not guaranteed. Thus, in the next sections, some sufficient
conditions are given for the existence of $f_{j_1,\hat{\theta}_n}$ and $f^{HT}_{j_1,\hat{\theta}_n,\xi}$ with probability tending to one as $n \to +\infty$.

Even if an explicit expression for $\hat{\theta}_n$ is not available, we use a numerical approximation of $\hat{\theta}_n$, obtained
via a gradient-descent algorithm with an adaptive step.

In this section we establish the rate of convergence of our estimators in terms of the Kullback- 
Leibler discrepancy over Besov classes. 

We make the following assumption on the wavelet basis that guarantees that Assumption 2 holds
uniformly over $F^s_{p,q}(M)$.

Assumption 3 Let $M > 0$, $1 \leq p \leq 2$ and $s > 1/p$. For $f \in F^s_{p,q}(M)$ and $h \in \mathbb{Z}$, let
$\rho(h) = \int_0^1 f(\omega) e^{-i 2\pi \omega h} \, d\omega$, $C_1(f) := \sum_{h \in \mathbb{Z}} |\rho(h)|$ and $C_2(f) := \sum_{h \in \mathbb{Z}} |h \rho^2(h)|$. Then, the wavelet basis is such
that there exists a constant $M_*$ such that for all $f \in F^s_{p,q}(M)$,
$C_1(f) \leq M_*$ and $C_2(f) \leq M_*$.
2.3 Linear estimation

The following theorem is the general result on the linear information projection estimator of the
spectral density function. Note that the choice of the coarse resolution level $j_0$ is of minor
importance, and without loss of generality we take $j_0 = 0$ for the linear estimator $f_{j_1,\hat{\theta}_n}$.

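Theorem 2.5 Assume that $f \in F^s_{2,2}(M)$ with $s > \frac{1}{2}$ and suppose that Assumptions 1, 2 and 3 are
satisfied. Define $j_1 = j_1(n)$ as the largest integer such that $2^{j_1} \leq n^{\frac{1}{2s+1}}$. Then, with probability
tending to one as $n \to +\infty$, the information projection estimator (2.4) exists and satisfies:

$$\Delta(f; f_{j_1(n),\hat{\theta}_n}) = O_P\big( n^{-\frac{2s}{2s+1}} \big).$$

Moreover, the convergence is uniform over the class $F^s_{2,2}(M)$ in the sense that

$$\lim_{K \to +\infty} \lim_{n \to +\infty} \sup_{f \in F^s_{2,2}(M)} P\Big( n^{\frac{2s}{2s+1}} \Delta(f; f_{j_1(n),\hat{\theta}_n}) > K \Big) = 0.$$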

This theorem provides the existence with probability tending to one of a linear estimator for the
spectral density $f$ given by $f_{j_1(n),\hat{\theta}_n}$. This estimator is strictly positive by construction. Therefore
the corresponding estimator of the covariance function $\hat{\rho}^L$ (which is obtained as the inverse Fourier
transform of $f_{j_1(n),\hat{\theta}_n}$) is a positive definite function by Bochner's theorem. Hence $\hat{\rho}^L$ is a covariance
function.
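The link between a positive spectral density and a positive definite covariance can be checked numerically. The sketch below is an illustration under stated assumptions: the density is symmetric about $1/2$ so that a cosine transform suffices, constant normalising factors are ignored, and the function names and the test density are hypothetical.

```python
import math
import random

def covariance(f, h, m=512):
    """Midpoint-rule approximation of rho(h) = int_0^1 f(w) cos(2*pi*w*h) dw."""
    return sum(f((i + 0.5) / m) * math.cos(2.0 * math.pi * (i + 0.5) / m * h)
               for i in range(m)) / m

def quadratic_form(rho, v):
    """Compute sum_{s,t} v_s v_t rho(|s - t|) for the Toeplitz matrix built from rho."""
    n = len(v)
    return sum(v[s] * v[t] * rho[abs(s - t)] for s in range(n) for t in range(n))

# Hypothetical strictly positive density, symmetric about 1/2.
f = lambda w: math.exp(math.cos(2.0 * math.pi * w))
n = 6
rho = [covariance(f, h) for h in range(n)]

rng = random.Random(1)
forms = [quadratic_form(rho, [rng.uniform(-1.0, 1.0) for _ in range(n)])
         for _ in range(50)]
```

Every quadratic form is positive because it equals an average of $f(\omega) \, |\sum_t v_t e^{i 2\pi \omega t}|^2$, which is the mechanism behind Bochner's theorem.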

In the related problem of density estimation from an i.i.d. sample, Koo [15] has shown that,
for the Kullback-Leibler divergence, $n^{-\frac{2s}{2s+1}}$ is the fastest rate of convergence for the problem of
estimating a density $f$ such that $\log(f)$ belongs to the space $B^s_{2,2}(M)$. For spectral densities belonging
to a general Besov ball $B^s_{p,q}(M)$, Neumann [18] has also shown that $n^{-\frac{2s}{2s+1}}$ is an optimal rate of
convergence for the $L_2$ risk. For the Kullback-Leibler divergence, we conjecture that $n^{-\frac{2s}{2s+1}}$ is the
minimax rate of convergence for spectral densities belonging to $F^s_{2,2}(M)$.

However, the result obtained in the above theorem is nonadaptive because the selection of $j_1(n)$
depends on the unknown smoothness $s$ of $f$. Moreover, the result is only suited for smooth functions
(as $F^s_{2,2}(M)$ corresponds to a Sobolev space of order $s$) and does not attain an optimal rate of
convergence when, for example, $g = \log(f)$ has singularities. We therefore propose in the next section
an adaptive estimator derived by applying an appropriate nonlinear thresholding procedure.

2.4 Adaptive estimation 
2.4.1 The bound on / is known 

In adaptive estimation, we need to define an appropriate thresholding rule for the wavelet coefficients
of the periodogram. This threshold is level-dependent and in this paper will take the form

$$\xi = \xi_{j,n} = 2 \Big[ 2 \|f\|_\infty \Big( \sqrt{\frac{\delta \log n}{n}} + 2^{j/2} \|\psi\|_\infty \frac{\delta \log n}{n} \Big) + \frac{C_*}{\sqrt{n}} \Big], \quad (2.6)$$

where $\delta \geq 0$ is a tuning parameter whose choice will be discussed later on and $C_*$ is a constant
depending only on the constants $C_1$ and $C_2$ of Assumption 2. The
following theorem states that the relative entropy between the true $f$ and its nonlinear estimator
achieves in probability the conjectured optimal rate of convergence up to a logarithmic factor over a
wide range of Besov balls.
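A level-dependent threshold of this kind can be sketched as a small function. The form used below, $2[2\|f\|_\infty(\sqrt{\delta \log n / n} + 2^{j/2}\|\psi\|_\infty \delta \log n / n) + C_*/\sqrt{n}]$, is an assumption read off from the concentration bound of Lemma 4.2 and the proof of Lemma 4.3; the function name, argument names and numerical inputs are hypothetical.

```python
import math

def level_threshold(j, n, sup_f, sup_psi, c_star, delta=6.0):
    """Assumed threshold xi_{j,n} at resolution level j for sample size n.

    sup_f   : bound on the sup-norm of the spectral density
    sup_psi : sup-norm of the mother wavelet
    c_star  : constant controlling the bias of the empirical coefficients
    delta   : tuning parameter
    """
    rate = delta * math.log(n) / n
    stochastic = 2.0 * sup_f * (math.sqrt(rate) + 2.0 ** (j / 2.0) * sup_psi * rate)
    return 2.0 * (stochastic + c_star / math.sqrt(n))

# Hypothetical inputs: the threshold grows with the level and shrinks with n.
xi_coarse = level_threshold(3, 1024, sup_f=1.0, sup_psi=1.0, c_star=0.5)
xi_fine = level_threshold(8, 1024, sup_f=1.0, sup_psi=1.0, c_star=0.5)
xi_more_data = level_threshold(3, 4096, sup_f=1.0, sup_psi=1.0, c_star=0.5)
```

The factor $2^{j/2}$ makes fine-scale coefficients harder to keep, which is what protects the estimator against noise at high resolution.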






Theorem 2.6 Assume that $f \in F^s_{p,q}(M)$ with $s > \frac{1}{2} + \frac{1}{p}$ and $1 \leq p \leq 2$. Suppose also that
Assumptions 1, 2, 3 hold. For any $n > 1$, define $j_0 = j_0(n)$ to be the integer such that $2^{j_0} \geq \log n > 2^{j_0 - 1}$,
and $j_1 = j_1(n)$ to be the integer such that $2^{j_1} \geq \frac{n}{\log n} > 2^{j_1 - 1}$. For $\delta \geq 6$, take the threshold
$\xi_{j,n}$ as in (2.6). Then, the thresholding estimator (2.5) exists with probability tending to one when
$n \to +\infty$ and satisfies:

$$\Delta\big( f; f^{HT}_{j_0(n),j_1(n),\hat{\theta}_n,\xi_{j,n}} \big) = O_P\Big( \Big( \frac{n}{\log n} \Big)^{-\frac{2s}{2s+1}} \Big).$$

Note that the choices of $j_0$, $j_1$ and $\xi_{j,n}$ are independent of the parameter $s$; hence the estimator
$f^{HT}_{j_0(n),j_1(n),\hat{\theta}_n,\xi_{j,n}}$ is an adaptive estimator which attains in probability what we claim is the optimal
rate of convergence, up to a logarithmic factor. In particular, $f^{HT}_{j_0(n),j_1(n),\hat{\theta}_n,\xi_{j,n}}$ is adaptive on $F^s_{2,2}(M)$.
This theorem provides the existence with probability tending to one of a nonlinear estimator for the
spectral density. This estimator is strictly positive by construction. Therefore the corresponding
estimator of the covariance function $\hat{\rho}^{NL}$ (which is obtained as the inverse Fourier transform of
$f^{HT}_{j_0(n),j_1(n),\hat{\theta}_n,\xi_{j,n}}$) is a positive definite function by Bochner's theorem. Hence $\hat{\rho}^{NL}$ is a covariance
function.
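The resolution levels used in the adaptive procedure depend only on the sample size. A minimal sketch, assuming the defining inequalities $2^{j_0} \geq \log n > 2^{j_0 - 1}$ and $2^{j_1} \geq n / \log n > 2^{j_1 - 1}$ (the helper names are hypothetical):

```python
import math

def smallest_level(target):
    """Smallest integer j >= 0 with 2**j >= target, so that 2**(j-1) < target when j >= 1."""
    j = 0
    while 2 ** j < target:
        j += 1
    return j

def resolution_levels(n):
    """Return (j0, j1) with 2^j0 >= log n > 2^(j0-1) and 2^j1 >= n/log n > 2^(j1-1)."""
    j0 = smallest_level(math.log(n))
    j1 = smallest_level(n / math.log(n))
    return j0, j1

j0, j1 = resolution_levels(1024)
```

For $n = 1024$ one has $\log n \approx 6.93$ and $n / \log n \approx 147.7$, which gives $j_0 = 3$ and $j_1 = 8$.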



2.4.2 Estimating the bound on / 



Although the results of Theorem 2.6 are certainly of some theoretical interest, they are not helpful for
practical applications. The (deterministic) threshold $\xi_{j,n}$ depends on the unknown quantities $\|f\|_\infty$
and $C_* := C(C_1, C_2)$, where $C_1$ and $C_2$ are unknown constants. To make the method applicable,
it is necessary to find some completely data-driven rule for the threshold, which works well over a
range as wide as possible of smoothness classes. In this subsection, we give an extension that leads
to a random threshold which no longer depends on the bound on $f$ nor on $C_*$. For this,
let us consider the dyadic partitions of $[0,1]$ given by $\mathcal{I}_n = \{ (j/2^{J_n}, (j+1)/2^{J_n}), \ j = 0, \ldots, 2^{J_n} - 1 \}$.
Given some positive integer $r$, we define $V_n$ as the space of piecewise polynomials of degree $r$ on the
dyadic partition $\mathcal{I}_n$ of step $2^{-J_n}$. The dimension of $V_n$ depends on $n$ and is denoted by $N_n$. Note
that $N_n = (r+1) 2^{J_n}$. This family is regular in the sense that the partition $\mathcal{I}_n$ has equispaced knots.

An estimator of $\|f\|_\infty$ is constructed as proposed by Birgé and Massart [6] in the following
way. We take the infinity norm of $\hat{f}_n$, where $\hat{f}_n$ denotes the (empirical) orthogonal projection of the
periodogram $I_n$ on $V_n$. We denote by $f_n$ the $L_2$-orthogonal projection of $f$ on the same space. Then
the following theorem holds.

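Theorem 2.7 Assume that $f \in F^s_{p,q}(M)$ with $s > \frac{1}{2} + \frac{1}{p}$ and $1 \leq p \leq 2$. Suppose also that
Assumptions 1, 2 and 3 hold. For any $n > 1$, let $j_0 = j_0(n)$ be the integer such that $2^{j_0} \geq \log n > 2^{j_0 - 1}$,
and let $j_1 = j_1(n)$ be the integer such that $2^{j_1} \geq \frac{n}{\log n} > 2^{j_1 - 1}$. Take the constants $\delta = 6$ and
$b$, and define the threshold

$$\hat{\xi}_{j,n} = 2 \Big[ 2 \|\hat{f}_n\|_\infty \Big( \sqrt{\frac{\delta}{(1-b)^2} \frac{\log n}{n}} + 2^{j/2} \|\psi\|_\infty \frac{\delta}{(1-b)^2} \frac{\log n}{n} \Big) + \sqrt{\frac{\log n}{n}} \Big]. \quad (2.7)$$

Then, if $\|f - f_n\|_\infty \leq \frac{1}{4} \|f\|_\infty$ and $N_n \leq \frac{\kappa n}{(r+1)^2 \log n}$, where $\kappa$ is a numerical constant and $r$ is the
degree of the polynomials, the thresholding estimator (2.5) exists with probability tending to one as
$n \to +\infty$ and satisfies

$$\Delta\big( f; f^{HT}_{j_0(n),j_1(n),\hat{\theta}_n,\hat{\xi}_{j,n}} \big) = O_P\Big( \Big( \frac{n}{\log n} \Big)^{-\frac{2s}{2s+1}} \Big).$$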

Note that we finally obtain a fully tractable estimator of $f$ which reaches the optimal rate of
convergence without prior knowledge of the regularity of the spectral density, and which also gives
rise to a genuine covariance estimator.

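Remark 2.8 We point out that, in Comte [7], the condition $\|f - f_n\|_\infty \leq \frac{1}{4} \|f\|_\infty$ is assumed. Under
some regularity conditions on $f$, results from approximation theory entail that this condition is met.
Indeed for $f \in B^s_{p,\infty}$, with $s > \frac{1}{p}$, we know from DeVore and Lorentz [11] that

$$\|f - f_n\|_\infty \leq C(s) \, |f|_{s,p} \, N_n^{-(s - \frac{1}{p})},$$

with $|f|_{s,p} = \sup_{y > 0} y^{-s} w_d(f, y)_p < +\infty$, where $w_d(f, y)_p$ is the modulus of smoothness and $d = [s] + 1$.
Therefore $\|f - f_n\|_\infty \leq \frac{1}{4} \|f\|_\infty$ if $N_n \geq \Big( \frac{4 C(s) |f|_{s,p}}{\|f\|_\infty} \Big)^{\frac{1}{s - 1/p}} := C(f,s,p)$, where $C(f,s,p)$ is a
constant depending on $f$, $s$ and $p$.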



3 Numerical experiments 

In this section we present some numerical experiments which support the claims made in the
theoretical part of this paper. The programs for our simulations were implemented using the MATLAB
programming environment. We simulate a time series which is a superposition of an ARMA(2,2)
process and a Gaussian white noise:

$$X_t = Y_t + c_0 Z_t, \quad (3.1)$$

where $Y_t + a_1 Y_{t-1} + a_2 Y_{t-2} = b_0 \varepsilon_t + b_1 \varepsilon_{t-1} + b_2 \varepsilon_{t-2}$, and $\{\varepsilon_t\}$, $\{Z_t\}$ are independent Gaussian white
noise processes with unit variance. The constants were chosen as $a_1 = 0.2$, $a_2 = 0.9$, $b_0 = 1$, $b_1 = 0$,
$b_2 = 1$ and $c_0 = 0.5$. We generated a sample of size $n = 1024$ according to (3.1). The spectral density
$f$ of $(X_t)$ is shown in Figure 1. It has two moderately sharp peaks and is smooth in the rest of the
domain.
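The simulated model can be reproduced along the following lines. This is a sketch in Python rather than the MATLAB code used for the experiments; the burn-in length, the seed and the $\frac{1}{2\pi}$ normalisation of the spectral density are assumptions of the example, and the function names are hypothetical.

```python
import cmath
import math
import random

A1, A2 = 0.2, 0.9          # autoregressive coefficients a_1, a_2
B0, B1, B2 = 1.0, 0.0, 1.0  # moving-average coefficients b_0, b_1, b_2
C0 = 0.5                    # scale of the additive white noise

def simulate_series(n, burn_in=500, seed=0):
    """Simulate X_t = Y_t + C0 * Z_t with Y an ARMA(2,2) process (burn-in discarded)."""
    rng = random.Random(seed)
    y_prev, eps_prev = [0.0, 0.0], [0.0, 0.0]   # (t-1, t-2) values
    x = []
    for t in range(n + burn_in):
        eps = rng.gauss(0.0, 1.0)
        y = (-A1 * y_prev[0] - A2 * y_prev[1]
             + B0 * eps + B1 * eps_prev[0] + B2 * eps_prev[1])
        y_prev, eps_prev = [y, y_prev[0]], [eps, eps_prev[0]]
        if t >= burn_in:
            x.append(y + C0 * rng.gauss(0.0, 1.0))
    return x

def true_spectral_density(omega):
    """Spectral density of X on [0, 1], assuming the 1/(2*pi) normalisation."""
    z = cmath.exp(-2j * math.pi * omega)
    ma = abs(B0 + B1 * z + B2 * z * z) ** 2
    ar = abs(1.0 + A1 * z + A2 * z * z) ** 2
    return (ma / ar + C0 ** 2) / (2.0 * math.pi)

x = simulate_series(1024)
f_at_zero = true_spectral_density(0.0)
```

The white-noise term adds the constant $c_0^2 / (2\pi)$ to the ARMA spectral density, so the true density is bounded away from zero.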

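Starting from the periodogram, we considered the Symmlet 8 basis, i.e. the least asymmetric,
compactly supported wavelets which are described in Daubechies [9]. We chose $j_0$ and $j_1$ as in the
hypotheses of Theorem 2.7 and left the coefficients assigned to the father wavelets unthresholded.
Hard thresholding is performed using the threshold $\hat{\xi}_{j,n}$ as in (2.7) for the levels $j = j_0, \ldots, j_1$, and the
empirical coefficients from the higher resolution scales $j > j_1$ are set to zero. This gives the estimate

$$f^{HT}_{j_0,j_1,\hat{\xi}_{j,n}} = \sum_{k=0}^{2^{j_0}-1} \hat{a}_{j_0,k} \phi_{j_0,k} + \sum_{j=j_0}^{j_1} \sum_{k=0}^{2^j-1} \hat{b}_{j,k} \, 1\big( |\hat{b}_{j,k}| > \hat{\xi}_{j,n} \big) \psi_{j,k}, \quad (3.2)$$

which is obtained by simply thresholding the wavelet coefficients (2.3) of the periodogram. Note that
such an estimator is not guaranteed to be strictly positive in the interval $[0,1]$. However, we use it
to build our strictly positive estimator $f^{HT}_{j_0,j_1,\hat{\theta}_n,\hat{\xi}_{j,n}}$ (see (2.5) to recall its definition). We want to find
$\hat{\theta}_n$ such that

$$\langle f^{HT}_{j_0,j_1,\hat{\theta}_n,\hat{\xi}_{j,n}}, \psi_{j,k} \rangle = \delta_{\hat{\xi}_{j,n}}(\hat{\beta}_{j,k}) \ \text{ for all } (j,k) \in \Lambda_{j_1}.$$

For this, we take

$$\hat{\theta}_n = \arg\min_{\theta \in \mathbb{R}^{\#\Lambda_{j_1}}} \sum_{(j,k) \in \Lambda_{j_1}} \Big( \langle f_{j_0,j_1,\theta}, \psi_{j,k} \rangle - \delta_{\hat{\xi}_{j,n}}(\hat{\beta}_{j,k}) \Big)^2,$$

where $f_{j_0,j_1,\theta}(\cdot) = \exp\Big( \sum_{(j,k) \in \Lambda_{j_1}} \theta_{j,k} \psi_{j,k}(\cdot) \Big) \in \mathcal{E}_{j_1}$ and $\mathcal{E}_{j_1}$ is the family (2.1). To solve this
optimization problem we used a gradient descent method with an adaptive step, taking as initial value

$$\theta_0 = \Big( \big\langle \log\big( (f^{HT}_{j_0,j_1,\hat{\xi}_{j,n}})_+ \big), \psi_{j,k} \big\rangle \Big)_{(j,k) \in \Lambda_{j_1}},$$

where $\big( f^{HT}_{j_0,j_1,\hat{\xi}_{j,n}}(\omega) \big)_+ := \max\big( f^{HT}_{j_0,j_1,\hat{\xi}_{j,n}}(\omega), \eta \big)$ for all $\omega \in [0,1]$ and $\eta > 0$ is a small constant.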
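The least-squares formulation of the information projection can be illustrated on a toy problem. The sketch below is not the MATLAB implementation used in the experiments: it assumes a Haar basis (so that every function is piecewise constant on dyadic cells and inner products are exact sums), starts from $\theta = 0$ rather than from the initial value described in the text, and uses a simple accept/reject rule as the adaptive step. All names are hypothetical.

```python
import math

def haar_basis_on_cells(j1):
    """Values of phi_{0,0} and psi_{j,k}, 0 <= j < j1, on the 2**j1 dyadic cells."""
    m = 2 ** j1
    rows = [[1.0] * m]
    for j in range(j1):
        width = m // (2 ** j)               # number of cells in the support of psi_{j,k}
        for k in range(2 ** j):
            row = [0.0] * m
            for c in range(width):
                sign = 1.0 if c < width // 2 else -1.0
                row[k * width + c] = sign * 2.0 ** (j / 2.0)
            rows.append(row)
    return rows

def moments(theta, basis):
    """Return (<f_theta, psi_i>)_i and the cell values of f_theta = exp(sum_i theta_i psi_i)."""
    m = len(basis[0])
    f = [math.exp(sum(t * row[c] for t, row in zip(theta, basis))) for c in range(m)]
    return [sum(fc * rc for fc, rc in zip(f, row)) / m for row in basis], f

def information_projection(target, basis, iters=3000, step=0.1):
    """Minimise sum_i (<f_theta, psi_i> - target_i)^2 by gradient descent."""
    m = len(basis[0])
    theta = [0.0] * len(basis)
    value = sum((a - b) ** 2 for a, b in zip(moments(theta, basis)[0], target))
    for _ in range(iters):
        mom, f = moments(theta, basis)
        resid = [a - b for a, b in zip(mom, target)]
        # d<f_theta, psi_i>/d theta_l = <f_theta * psi_l, psi_i>
        grad = [sum(2.0 * r * sum(fc * a * b for fc, a, b in zip(f, row_l, row_i)) / m
                    for r, row_i in zip(resid, basis))
                for row_l in basis]
        candidate = [t - step * g for t, g in zip(theta, grad)]
        new_value = sum((a - b) ** 2
                        for a, b in zip(moments(candidate, basis)[0], target))
        if new_value < value:               # accept the step and enlarge it
            theta, value, step = candidate, new_value, step * 1.2
        else:                               # reject the step and shrink it
            step *= 0.5
    return theta

basis = haar_basis_on_cells(2)
target, _ = moments([0.3, -0.5, 0.2, 0.4], basis)   # coefficients of a known positive function
theta_hat = information_projection(target, basis)
fitted, f_cells = moments(theta_hat, basis)
```

Because the fitted function is an exponential, it is strictly positive whatever the value of `theta_hat`, which is the point of projecting onto the exponential family.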

In Figure 1 we display the unconstrained estimator $f^{HT}_{j_0,j_1,\hat{\xi}_{j,n}}$ as in (3.2), obtained by thresholding
of the wavelet coefficients of the periodogram, together with the estimator $f^{HT}_{j_0,j_1,\hat{\theta}_n,\hat{\xi}_{j,n}}$, which is
strictly positive by construction. Note that these wavelet estimators capture the peaks well and look
fairly good on the smooth part too.

We compared our method with the spectral density estimator proposed by Comte [7], which is
based on a model selection procedure. As an example, in Comte [7], the author studies the behavior of
such estimators using a collection of nested models $(S_m)$, with $m = 1, \ldots, 100$, where $S_m$ is the space of
piecewise constant functions, generated by a histogram basis on $[0,1]$ of dimension $m$ with equispaced
knots (see Comte [7] for further details). In Figure 2 we show the result of this comparison. Note
that our method better captures the peaks of the true spectral density.
4 Appendix

Throughout all the proofs, C denotes a generic constant whose value may change from line to line. 

4.1 Technical results for the empirical estimators of the wavelet coefficients 
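Lemma 4.1 Let $n \geq 1$, $\beta_{j,k} := \langle f, \psi_{j,k} \rangle$ and $\hat{\beta}_{j,k} := \langle I_n, \psi_{j,k} \rangle$ for $j \geq j_0 - 1$ and $0 \leq k \leq 2^j - 1$.
Suppose that Assumptions 1, 2 and 3 hold. Then, $\mathrm{Bias}^2(\hat{\beta}_{j,k}) := \big( E(\hat{\beta}_{j,k}) - \beta_{j,k} \big)^2 \leq \frac{C_*^2}{n}$, where $C_*$ is the
constant depending only on $C_1$ and $C_2$ that appears in the threshold (2.6), and
$\mathrm{Var}(\hat{\beta}_{j,k}) := E\big( \hat{\beta}_{j,k} - E(\hat{\beta}_{j,k}) \big)^2 \leq \frac{C}{n}$ for some constant $C > 0$. Moreover,
there exists a constant $M_2 > 0$ such that for all $f \in F^s_{p,q}(M)$ with $s > \frac{1}{p}$ and $1 \leq p \leq 2$,

$$E\big( \hat{\beta}_{j,k} - \beta_{j,k} \big)^2 = \mathrm{Bias}^2(\hat{\beta}_{j,k}) + \mathrm{Var}(\hat{\beta}_{j,k}) \leq \frac{M_2}{n}.$$

Figure 1: True spectral density $f$, wavelet thresholding estimator $f^{HT}_{j_0,j_1,\hat{\xi}_{j,n}}$ and final positive estimator
$f^{HT}_{j_0,j_1,\hat{\theta}_n,\hat{\xi}_{j,n}}$.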

Proof. Note that $\mathrm{Bias}^2(\hat{\beta}_{j,k}) \leq \|f - E(I_n)\|^2_{L_2}$. Using Proposition 1 in Comte [7], Assumptions 1
and 2 imply that $\|f - E(I_n)\|^2_{L_2} \leq \frac{C_*^2}{n}$, which gives the result for the bias term. To bound the
variance term, remark that

$$\mathrm{Var}(\hat{\beta}_{j,k}) = E \langle I_n - E(I_n), \psi_{j,k} \rangle^2 \leq E \|I_n - E(I_n)\|^2_{L_2} \|\psi_{j,k}\|^2_{L_2} = \int_0^1 E |I_n(\omega) - E(I_n(\omega))|^2 \, d\omega.$$

Then, under Assumptions 1 and 2 it follows that there exists an absolute constant $C > 0$ such
that for all $\omega \in [0,1]$, $E |I_n(\omega) - E(I_n(\omega))|^2 \leq \frac{C}{n}$. To complete the proof it remains to remark that
Assumption 3 implies that these bounds for the bias and the variance hold uniformly over $F^s_{p,q}(M)$.
Figure 2: True spectral density $f$, final positive estimator $f^{HT}_{j_0,j_1,\hat{\theta}_n,\hat{\xi}_{j,n}}$ and estimator via model
selection using regular histograms.



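Lemma 4.2 Let $n \geq 1$, $b_{j,k} := \langle f, \psi_{j,k} \rangle$ and $\hat{b}_{j,k} := \langle I_n, \psi_{j,k} \rangle$ for $j \geq j_0$ and $0 \leq k \leq 2^j - 1$. Suppose
that Assumptions 1, 2 and 3 hold. Then for any $x > 0$,

$$P\Big( |\hat{b}_{j,k} - b_{j,k}| > 2 \|f\|_\infty \Big( \sqrt{\frac{x}{n}} + 2^{j/2} \|\psi\|_\infty \frac{x}{n} \Big) + \frac{C_*}{\sqrt{n}} \Big) \leq 2 e^{-x},$$

where $C_*$ is the constant of Lemma 4.1.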
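Proof. Note that

$$\hat{b}_{j,k} = \frac{1}{2\pi n} \sum_{t=1}^{n} \sum_{t'=1}^{n} (X_t - \bar{X})(X_{t'} - \bar{X})^* \int_0^1 e^{-i 2\pi \omega (t - t')} \psi_{j,k}(\omega) \, d\omega = \frac{1}{2\pi n} X^T T_n(\psi_{j,k}) X^*,$$

where $X = (X_1 - \bar{X}, \ldots, X_n - \bar{X})^T$, $X^T$ denotes the transpose of $X$ and $T_n(\psi_{j,k})$ is the Toeplitz
matrix with entries $[T_n(\psi_{j,k})]_{t,t'} = \int_0^1 e^{-i 2\pi \omega (t - t')} \psi_{j,k}(\omega) \, d\omega$, $1 \leq t, t' \leq n$. We can assume without
loss of generality that $E(X_t) = 0$, and then under Assumptions 1 and 2, $X$ is a centered
Gaussian vector in $\mathbb{R}^n$ with covariance matrix $\Sigma = T_n(f)$. Using the decomposition $X = \Sigma^{1/2} \varepsilon$,
where $\varepsilon \sim N(0, I_n)$, it follows that $\hat{b}_{j,k} = \frac{1}{2\pi n} \varepsilon^T A_{j,k} \varepsilon$, with $A_{j,k} = \Sigma^{1/2} T_n(\psi_{j,k}) \Sigma^{1/2}$. Note also that
$E(\hat{b}_{j,k}) = \frac{1}{2\pi n} \mathrm{tr}(A_{j,k})$, where $\mathrm{tr}(A)$ denotes the trace of a matrix $A$.

Now let $s_1, \ldots, s_n$ be the eigenvalues of the Hermitian matrix $A_{j,k}$ with $|s_1| \geq |s_2| \geq \cdots \geq |s_n|$
and let $Z = 2\pi n \big( \hat{b}_{j,k} - E(\hat{b}_{j,k}) \big) = \varepsilon^T A_{j,k} \varepsilon - \mathrm{tr}(A_{j,k})$. Then, for $0 < \lambda < (2|s_1|)^{-1}$ one has that

$$\log E(e^{\lambda Z}) = \sum_{i=1}^{n} \Big( -\lambda s_i - \frac{1}{2} \log(1 - 2\lambda s_i) \Big) = \sum_{i=1}^{n} \sum_{\ell=2}^{+\infty} \frac{1}{2\ell} (2 s_i \lambda)^\ell \leq \sum_{i=1}^{n} \sum_{\ell=2}^{+\infty} \frac{1}{2\ell} (2 |s_i| \lambda)^\ell = \sum_{i=1}^{n} \Big( -\lambda |s_i| - \frac{1}{2} \log(1 - 2\lambda |s_i|) \Big),$$

where we have used the fact that $-\log(1 - x) = \sum_{\ell=1}^{+\infty} \frac{x^\ell}{\ell}$ for $x < 1$. Then using the inequality
$-u - \frac{1}{2} \log(1 - 2u) \leq \frac{u^2}{1 - 2u}$ that holds for all $0 < u < \frac{1}{2}$, the above inequality implies that

$$\log E(e^{\lambda Z}) \leq \sum_{i=1}^{n} \frac{\lambda^2 |s_i|^2}{1 - 2\lambda |s_i|} \leq \frac{\lambda^2 \|s\|^2}{1 - 2\lambda |s_1|},$$

where $\|s\|^2 = \sum_{i=1}^{n} |s_i|^2$. Arguing as in Birgé and Massart [6], the above inequality implies that for
any $x > 0$, $P(|Z| > 2|s_1| x + 2 \|s\| \sqrt{x}) \leq 2 e^{-x}$, which implies

$$P\Big( |\hat{b}_{j,k} - E(\hat{b}_{j,k})| > 2 |s_1| \frac{x}{n} + 2 \frac{\|s\|}{n} \sqrt{x} \Big) \leq 2 e^{-x}. \quad (4.1)$$

Let $\tau(A)$ denote the spectral radius of a matrix $A$. For the Toeplitz matrices $\Sigma = T_n(f)$ and
$T_n(\psi_{j,k})$ one has that $\tau(\Sigma) \leq \|f\|_\infty$ and $\tau(T_n(\psi_{j,k})) \leq \|\psi_{j,k}\|_\infty = 2^{j/2} \|\psi\|_\infty$. These inequalities
imply that

$$|s_1| = \tau\big( \Sigma^{1/2} T_n(\psi_{j,k}) \Sigma^{1/2} \big) \leq \tau(\Sigma) \, \tau(T_n(\psi_{j,k})) \leq \|f\|_\infty 2^{j/2} \|\psi\|_\infty. \quad (4.2)$$

Let $\lambda_i$, $i = 1, \ldots, n$, be the eigenvalues of $T_n(\psi_{j,k})$. From Lemma 3.1 in Davies [10], we have that

$$\limsup_{n \to +\infty} \frac{1}{n} \mathrm{tr}\big( T_n(\psi_{j,k})^2 \big) = \limsup_{n \to +\infty} \frac{1}{n} \sum_{i=1}^{n} \lambda_i^2 = \int_0^1 \psi_{j,k}^2(\omega) \, d\omega = 1,$$

which implies that

$$\|s\|^2 = \sum_{i=1}^{n} s_i^2 = \mathrm{tr}(A_{j,k}^2) = \mathrm{tr}\big( (\Sigma T_n(\psi_{j,k}))^2 \big) \leq \tau(\Sigma)^2 \, \mathrm{tr}\big( T_n(\psi_{j,k})^2 \big) \leq \|f\|_\infty^2 \, n, \quad (4.3)$$

where we have used the inequality $\mathrm{tr}((AB)^2) \leq \tau(A)^2 \mathrm{tr}(B^2)$ that holds for any pair of Hermitian
matrices $A$, $B$. Combining (4.1), (4.2) and (4.3), we finally obtain that for any $x > 0$

$$P\Big( |\hat{b}_{j,k} - E(\hat{b}_{j,k})| > 2 \|f\|_\infty \Big( \sqrt{\frac{x}{n}} + 2^{j/2} \|\psi\|_\infty \frac{x}{n} \Big) \Big) \leq 2 e^{-x}. \quad (4.4)$$

Now, let $\xi_{j,n} = 2 \|f\|_\infty \Big( \sqrt{\frac{x}{n}} + 2^{j/2} \|\psi\|_\infty \frac{x}{n} \Big) + \frac{C_*}{\sqrt{n}}$, and note that

$$P\big( |\hat{b}_{j,k} - b_{j,k}| > \xi_{j,n} \big) \leq P\big( |\hat{b}_{j,k} - E(\hat{b}_{j,k})| > \xi_{j,n} - |E(\hat{b}_{j,k}) - b_{j,k}| \big).$$

By Lemma 4.1 one has that $|E(\hat{b}_{j,k}) - b_{j,k}| \leq \frac{C_*}{\sqrt{n}}$, and thus
$\xi_{j,n} - |E(\hat{b}_{j,k}) - b_{j,k}| \geq 2 \|f\|_\infty \Big( \sqrt{\frac{x}{n}} + 2^{j/2} \|\psi\|_\infty \frac{x}{n} \Big)$, which implies using (4.4) that

$$P\big( |\hat{b}_{j,k} - b_{j,k}| > \xi_{j,n} \big) \leq 2 e^{-x},$$

which completes the proof of Lemma 4.2.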



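Lemma 4.3 Assume that $f \in F^s_{p,q}(M)$ with $s > \frac{1}{2} + \frac{1}{p}$ and $1 \leq p \leq 2$. Suppose that Assumptions
1, 2 and 3 hold. For any $n > 1$, define $j_0 = j_0(n)$ to be the integer such that $2^{j_0} \geq \log n > 2^{j_0 - 1}$,
and $j_1 = j_1(n)$ to be the integer such that $2^{j_1} \geq \frac{n}{\log n} > 2^{j_1 - 1}$. For $\delta \geq 6$, take the threshold

$$\xi_{j,n} = 2 \Big[ 2 \|f\|_\infty \Big( \sqrt{\frac{\delta \log n}{n}} + 2^{j/2} \|\psi\|_\infty \frac{\delta \log n}{n} \Big) + \frac{C_*}{\sqrt{n}} \Big]$$

as in (2.6), where $C_*$ is the constant of Lemma 4.1. Let $\beta_{j,k} := \langle f, \psi_{j,k} \rangle$ and
$\hat{\beta}_{\xi_{j,n},(j,k)} := \delta_{\xi_{j,n}}(\hat{\beta}_{j,k})$ with $(j,k) \in \Lambda_{j_1}$ as in (2.5). Take $\beta = (\beta_{j,k})_{(j,k) \in \Lambda_{j_1}}$ and
$\hat{\beta}_{\xi_{j,n}} = (\hat{\beta}_{\xi_{j,n},(j,k)})_{(j,k) \in \Lambda_{j_1}}$. Then there exists a constant $M_3 > 0$ such that for all sufficiently large $n$:

$$E \|\beta - \hat{\beta}_{\xi_{j,n}}\|_2^2 := E \sum_{(j,k) \in \Lambda_{j_1}} \big| \beta_{j,k} - \delta_{\xi_{j,n}}(\hat{\beta}_{j,k}) \big|^2 \leq M_3 \Big( \frac{n}{\log n} \Big)^{-\frac{2s}{2s+1}}$$

uniformly over $F^s_{p,q}(M)$.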

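Proof. Taking into account that

$$E \|\beta - \hat{\beta}_{\xi_{j,n}}\|_2^2 = \sum_{k=0}^{2^{j_0}-1} E(a_{j_0,k} - \hat{a}_{j_0,k})^2 + \sum_{j=j_0}^{j_1} \sum_{k=0}^{2^j-1} E\Big[ (b_{j,k} - \hat{b}_{j,k})^2 \, 1\big( |\hat{b}_{j,k}| > \xi_{j,n} \big) \Big] + \sum_{j=j_0}^{j_1} \sum_{k=0}^{2^j-1} b_{j,k}^2 \, P\big( |\hat{b}_{j,k}| \leq \xi_{j,n} \big) := T_1 + T_2 + T_3, \quad (4.5)$$

we are interested in bounding these three terms. The bound for $T_1$ follows from Lemma 4.1 and the
fact that $j_0 = \log_2(\log n) \leq \frac{1}{2s+1} \log_2(n)$:

$$T_1 = \sum_{k=0}^{2^{j_0}-1} E(a_{j_0,k} - \hat{a}_{j_0,k})^2 = O\Big( \frac{2^{j_0}}{n} \Big) \leq O\big( n^{-\frac{2s}{2s+1}} \big). \quad (4.6)$$

To bound $T_2$ and $T_3$ we proceed as follows. Write

$$T_2 = \sum_{j=j_0}^{j_1} \sum_{k=0}^{2^j-1} E\Big[ (b_{j,k} - \hat{b}_{j,k})^2 \Big\{ 1\Big( |\hat{b}_{j,k}| > \xi_{j,n}, |b_{j,k}| > \frac{\xi_{j,n}}{2} \Big) + 1\Big( |\hat{b}_{j,k}| > \xi_{j,n}, |b_{j,k}| \leq \frac{\xi_{j,n}}{2} \Big) \Big\} \Big]$$

and

$$T_3 = \sum_{j=j_0}^{j_1} \sum_{k=0}^{2^j-1} b_{j,k}^2 \Big[ P\big( |\hat{b}_{j,k}| \leq \xi_{j,n}, |b_{j,k}| \leq 2\xi_{j,n} \big) + P\big( |\hat{b}_{j,k}| \leq \xi_{j,n}, |b_{j,k}| > 2\xi_{j,n} \big) \Big].$$

From Härdle, Kerkyacharian, Picard and Tsybakov [14] we get that

$$T_2 + T_3 \leq \sum_{j=j_0}^{j_1} \sum_{k=0}^{2^j-1} E\big[ (b_{j,k} - \hat{b}_{j,k})^2 \big] \, 1\Big( |b_{j,k}| > \frac{\xi_{j,n}}{2} \Big) + \sum_{j=j_0}^{j_1} \sum_{k=0}^{2^j-1} b_{j,k}^2 \, 1\big( |b_{j,k}| \leq 2\xi_{j,n} \big) + 5 \sum_{j=j_0}^{j_1} \sum_{k=0}^{2^j-1} E\Big[ (b_{j,k} - \hat{b}_{j,k})^2 \, 1\Big( |\hat{b}_{j,k} - b_{j,k}| > \frac{\xi_{j,n}}{2} \Big) \Big] := T' + T'' + T'''.$$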

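Now we bound $T'''$. Using the Cauchy-Schwarz inequality, we obtain

$$T''' \le 5 \sum_{j=j_0}^{j_1} \sum_{k=0}^{2^j-1} E^{1/2}\left[(\hat b_{j,k} - b_{j,k})^4\right] P^{1/2}\left(|\hat b_{j,k} - b_{j,k}| > \frac{\xi_{j,n}}{2}\right).$$

By the same inequality we get $E(\hat b_{j,k} - b_{j,k})^4 = E\left|\langle I_n - f, \psi_{j,k} \rangle\right|^4 \le E\left(\|I_n - f\|_{L_2}^4 \|\psi_{j,k}\|_{L_2}^4\right) = O\left(E\|I_n - f\|_{L_2}^4\right)$. It can be checked that $E\|I_n - f\|_{L_2}^4 \le 8E\left(\|I_n - E(I_n)\|_{L_2}^4 + \|E(I_n) - f\|_{L_2}^4\right)$. According to Comte [7], $E\|I_n - E(I_n)\|_{L_2}^4 = O(n^2)$. From the proof of Lemma 4.1 we get that $\|E(I_n) - f\|_{L_2}^4 = O\left(\frac{1}{n^2}\right)$. Therefore $E\|I_n - f\|_{L_2}^4 \le O\left(n^2 + \frac{1}{n^2}\right) = O(n^2)$. Hence $E(\hat b_{j,k} - b_{j,k})^4 = O\left(E\|I_n - f\|_{L_2}^4\right) = O(n^2)$. For the bound of $P\left(|\hat b_{j,k} - b_{j,k}| > \frac{\xi_{j,n}}{2}\right)$ we use the result of Lemma 4.2 with $x = \delta \log n$, where $\delta > 0$ is a constant to be specified later. We obtain

$$P\left(|\hat b_{j,k} - b_{j,k}| > \frac{\xi_{j,n}}{2}\right) = P\left(|\hat b_{j,k} - b_{j,k}| > 2\|f\|_\infty \left(\sqrt{\frac{\delta \log n}{n}} + 2^{j/2} \|\psi\|_\infty \frac{\delta \log n}{n}\right) + \frac{C_*}{\sqrt{n}}\right) \le 2e^{-\delta \log n} = 2n^{-\delta}.$$

Therefore, for $\delta \ge 6$, we get

$$T''' \le 5 \sum_{j=j_0}^{j_1} \sum_{k=0}^{2^j-1} E^{1/2}\left[(\hat b_{j,k} - b_{j,k})^4\right] P^{1/2}\left(|\hat b_{j,k} - b_{j,k}| > \frac{\xi_{j,n}}{2}\right) \le C \sum_{j=j_0}^{j_1} \sum_{k=0}^{2^j-1} n \cdot n^{-\delta/2} \le O\left(\frac{n^{-1}}{\log n}\right) \le O\left(n^{-\frac{2s}{2s+1}}\right).$$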
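The role of the condition $\delta \ge 6$ in the bound for $T'''$ is pure exponent bookkeeping: there are at most $2^{j_1+1} < 4n/\log n$ coefficients, each contributing a fourth-moment factor of order $n$ and a tail factor $\sqrt{2}\, n^{-\delta/2}$. The sketch below evaluates this product with all unspecified constants set to 1 (an assumption made only for illustration; `t3_bound` is a hypothetical name).

```python
import math

def t3_bound(n, delta):
    """Order of 5 * (#coefficients) * E^{1/2}[(b_hat - b)^4] * P^{1/2}(...),
    with the unspecified constant in E^{1/2}[...] = O(n) taken equal to 1."""
    j1 = math.ceil(math.log2(n / math.log(n)))
    n_coeffs = 2 ** (j1 + 1)              # upper bound on sum_{j=j0}^{j1} 2^j
    fourth_moment_root = float(n)          # E^{1/2}[(b_hat - b)^4] = O(n)
    tail_root = math.sqrt(2.0) * n ** (-delta / 2.0)  # sqrt(2 n^{-delta})
    return 5.0 * n_coeffs * fourth_moment_root * tail_root

for n in (10**3, 10**5, 10**7):
    # With delta = 6 the bound times n*log(n) stays below 20*sqrt(2).
    print(n, t3_bound(n, 6.0) * n * math.log(n))
```

With $\delta = 4$ the same product only decays like $1/\log n$, which is too slow to be dominated by $n^{-2s/(2s+1)}$.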



Now we follow results found in Pensky and Sapatinas [19] to bound $T'$ and $T''$. Let $j_A$ be the integer such that $2^{j_A} \ge \left(\frac{n}{\log n}\right)^{\frac{1}{2s+1}} > 2^{j_A - 1}$ (note that given our assumptions $j_0 \le j_A \le j_1$ for all sufficiently large $n$), then $T'$ can be partitioned as $T' = T'_1 + T'_2$, where the first component is calculated over the set of indices $j_0 \le j \le j_A$ and the second component over $j_A + 1 \le j \le j_1$. Hence, using Lemma 4.1 we obtain

$$T'_1 \le C \sum_{j=j_0}^{j_A} \frac{2^j}{n} = O\left(2^{j_A} n^{-1}\right) = O\left(\left(\frac{n}{\log n}\right)^{\frac{1}{2s+1}} n^{-1}\right) \le O\left(n^{-\frac{2s}{2s+1}}\right).$$ (4.7)
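The intermediate level $j_A$ and the ordering $j_0 \le j_A \le j_1$ can be explored numerically. The following sketch (with the hypothetical helper `levels`) computes the three levels from their defining inequalities; it suggests that the ordering is genuinely an asymptotic statement, since for moderate $n$ the level $j_A$ can fall below $j_0$.

```python
import math

def levels(n, s):
    """Return (j0, jA, j1) for an integer sample size n > 1 and smoothness s > 0.

    j0: 2**j0 >= log n > 2**(j0-1)
    jA: 2**jA >= (n/log n)**(1/(2s+1)) > 2**(jA-1)
    j1: 2**j1 >= n/log n > 2**(j1-1)
    """
    ratio = n / math.log(n)
    j0 = math.ceil(math.log2(math.log(n)))
    j_a = math.ceil(math.log2(ratio) / (2.0 * s + 1.0))
    j1 = math.ceil(math.log2(ratio))
    return j0, j_a, j1

for n in (1024, 10**9):
    j0, j_a, j1 = levels(n, 2.0)
    print(n, (j0, j_a, j1), j0 <= j_a <= j1)
```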
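To obtain a bound for $T'_2$, we will use that if $f \in F^s_{p,q}(A)$, then for some constant $C$, dependent on $s$, $p$, $q$ and $A > 0$ only, we have that

$$\sum_{k=0}^{2^j-1} b_{j,k}^2 \le C 2^{-2js^*},$$ (4.8)

for $1 \le p \le 2$, where $s^* = s + \frac{1}{2} - \frac{1}{p}$. Taking into account that $I\left(|b_{j,k}| > \frac{\xi_{j,n}}{2}\right) \le \frac{4}{\xi_{j,n}^2} |b_{j,k}|^2$, we get

$$T'_2 \le \frac{C}{n} \sum_{j=j_A+1}^{j_1} \sum_{k=0}^{2^j-1} \frac{4}{\xi_{j,n}^2} |b_{j,k}|^2 \le \frac{C}{4\|f\|_\infty^2 \left(\sqrt{\delta \log n} + \|\psi\|_\infty\, \delta\, n^{-\frac{s}{2s+1}} (\log n)^{\frac{4s+1}{4s+2}}\right)^2} \sum_{j=j_A+1}^{j_1} \sum_{k=0}^{2^j-1} |b_{j,k}|^2 \le O\left(2^{-2s^* j_A}\right) = O\left(\left(\frac{n}{\log n}\right)^{-\frac{2s^*}{2s+1}}\right),$$

where we used the fact that $\sqrt{\delta \log n} + \|\psi\|_\infty\, \delta\, n^{-\frac{s}{2s+1}} (\log n)^{\frac{4s+1}{4s+2}} \to +\infty$ when $n \to +\infty$. Now remark that if $p = 2$ then $s^* = s$ and thus

$$T'_2 = O\left(\left(\frac{n}{\log n}\right)^{-\frac{2s^*}{2s+1}}\right) = O\left(\left(\frac{n}{\log n}\right)^{-\frac{2s}{2s+1}}\right).$$ (4.9)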

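For the case $1 \le p < 2$, the repeated use of the fact that if $B, D > 0$ then $I(|b_{j,k}| > B + D) \le I(|b_{j,k}| > B)$ enables us to obtain that

$$T'_2 \le \frac{C}{n} \sum_{j=j_A+1}^{j_1} \sum_{k=0}^{2^j-1} I\left(|b_{j,k}| > \frac{\xi_{j,n}}{2}\right) \le \frac{C}{n} \sum_{j=j_A+1}^{j_1} \sum_{k=0}^{2^j-1} |b_{j,k}|^{-p} |b_{j,k}|^p\, I\left(|b_{j,k}|^{-p} < \left(2\|f\|_\infty \sqrt{\frac{\delta \log n}{n}}\right)^{-p}\right) \le \frac{C}{n} \sum_{j=j_A+1}^{j_1} \sum_{k=0}^{2^j-1} \left(2\|f\|_\infty \sqrt{\frac{\delta \log n}{n}}\right)^{-p} |b_{j,k}|^p.$$

Since $f \in F^s_{p,q}(A)$ it follows that there exists a constant $C$ depending only on $p$, $q$, $s$ and $A$ such that

$$\sum_{k=0}^{2^j-1} |b_{j,k}|^p \le C 2^{-pjs^*},$$ (4.10)

where $s^* = s + \frac{1}{2} - \frac{1}{p}$ as before. By (4.10) we get

$$T'_2 \le \frac{(\log n)^{-p/2}}{n^{1-p/2}}\, C(\|f\|_\infty, \delta, p) \sum_{j=j_A+1}^{j_1} \sum_{k=0}^{2^j-1} |b_{j,k}|^p \le C(\|f\|_\infty, \delta, p)\, \frac{(\log n)^{-p/2}}{n^{1-p/2}} \sum_{j=j_A+1}^{j_1} C 2^{-pjs^*} = O\left(\frac{(\log n)^{-p/2}}{n^{1-p/2}}\, 2^{-p j_A s^*}\right) \le O\left(\left(\frac{n}{\log n}\right)^{-\frac{2s}{2s+1}}\right).$$ (4.11)

Hence, by (4.7), (4.9) and (4.11), $T' = O\left(\left(\frac{n}{\log n}\right)^{-\frac{2s}{2s+1}}\right)$.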
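Now, set $j_A$ as before, then $T''$ can be split into $T'' = T''_1 + T''_2$, where the first component is calculated over the set of indices $j_0 \le j \le j_A$ and the second component over $j_A + 1 \le j \le j_1$. Then

$$T''_1 \le \sum_{j=j_0}^{j_A} \sum_{k=0}^{2^j-1} 4\xi_{j,n}^2 = 16 \sum_{j=j_0}^{j_A} \sum_{k=0}^{2^j-1} \left[2\|f\|_\infty \left(\sqrt{\frac{\delta \log n}{n}} + 2^{j/2} \|\psi\|_\infty \frac{\delta \log n}{n}\right) + \frac{C_*}{\sqrt{n}}\right]^2.$$

Using repeatedly that $(B + D)^2 \le 2(B^2 + D^2)$ for $B, D \in \mathbb{R}$, we obtain the desired bound for $T''_1$:

$$T''_1 \le C \sum_{j=j_0}^{j_A} \sum_{k=0}^{2^j-1} \left[\|f\|_\infty^2 \left(\frac{\delta \log n}{n} + 2^j \|\psi\|_\infty^2 \frac{\delta^2 (\log n)^2}{n^2}\right) + \frac{C_*^2}{n}\right] \le C(\|f\|_\infty, \delta, C_*)\, \frac{\log n}{n}\, 2^{j_A} + C(\|f\|_\infty, \delta, \|\psi\|_\infty)\, \frac{(\log n)^2}{n^2}\, 2^{2j_A}$$
$$= O\left((\log n)^{\frac{2s}{2s+1}} n^{-\frac{2s}{2s+1}} + (\log n)^{\frac{4s}{2s+1}} n^{-\frac{4s}{2s+1}}\right) \le O\left(\left(\frac{n}{\log n}\right)^{-\frac{2s}{2s+1}}\right).$$ (4.12)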



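To bound $T''_2$, note that

$$T''_2 \le \sum_{j=j_A+1}^{j_1} \sum_{k=0}^{2^j-1} b_{j,k}^2 = O\left(2^{-2 j_A s^*}\right) = O\left(\left(\frac{n}{\log n}\right)^{-\frac{2s^*}{2s+1}}\right),$$

where we have used the condition (4.8). Now remark that if $p = 2$ then $s^* = s$ and thus

$$T''_2 = O\left(\left(\frac{n}{\log n}\right)^{-\frac{2s^*}{2s+1}}\right) = O\left(\left(\frac{n}{\log n}\right)^{-\frac{2s}{2s+1}}\right).$$ (4.13)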



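If $1 \le p < 2$,

$$T''_2 = \sum_{j=j_A+1}^{j_1} \sum_{k=0}^{2^j-1} |b_{j,k}|^{2-p} |b_{j,k}|^p\, I\left(|b_{j,k}| \le 2\xi_{j,n}\right) \le \sum_{j=j_A+1}^{j_1} \sum_{k=0}^{2^j-1} \left(4\left[2\|f\|_\infty \left(\sqrt{\frac{\delta \log n}{n}} + 2^{j/2} \|\psi\|_\infty \frac{\delta \log n}{n}\right) + \sqrt{\frac{\log n}{n}}\right]\right)^{2-p} |b_{j,k}|^p$$
$$\le \left(C(\|f\|_\infty, \|\psi\|_\infty, \delta)\right)^{2-p} \left(\frac{\log n}{n}\right)^{\frac{2-p}{2}} \sum_{j=j_A+1}^{j_1} \sum_{k=0}^{2^j-1} |b_{j,k}|^p = O\left(\left(\frac{\log n}{n}\right)^{\frac{2-p}{2}} 2^{-p j_A s^*}\right) = O\left(\left(\frac{n}{\log n}\right)^{-\frac{2s}{2s+1}}\right),$$ (4.14)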



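where we have used condition (4.10) and the fact that $C_* \le \sqrt{\log n}$ for $n$ sufficiently large, taking into account that the constant $C_* := C(C_1, C_2)$ does not depend on $n$. Hence, by (4.12), (4.13) and (4.14), $T'' = O\left(\left(\frac{n}{\log n}\right)^{-\frac{2s}{2s+1}}\right)$.

Combining all terms in (4.5), we conclude that:

$$E\left\|\hat\beta_{\xi_{j,n}} - \beta\right\|_2^2 = O\left(\left(\frac{n}{\log n}\right)^{-\frac{2s}{2s+1}}\right).$$

This completes the proof.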
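Lemma 4.4 Assume that $f \in F^s_{p,q}(M)$ with $s > \frac{1}{2} + \frac{1}{p}$ and $1 \le p \le 2$. Suppose that Assumptions 1 and 2 hold. For any $n > 1$, define $j_0 = j_0(n)$ to be the integer such that $2^{j_0} \ge \log n > 2^{j_0 - 1}$, and $j_1 = j_1(n)$ to be the integer such that $2^{j_1} \ge \frac{n}{\log n} > 2^{j_1 - 1}$. Define the threshold $\hat\xi_{j,n}$ as in (2.7) for the constants $\delta = 6$ and $b$ chosen there. Let $\beta_{j,k} := \langle f, \psi_{j,k} \rangle$ and $\hat\beta_{\hat\xi_{j,n},(j,k)} := \delta_{\hat\xi_{j,n}}(\hat\beta_{j,k})$ with $(j,k) \in \Lambda_{j_1}$. Take $\beta = (\beta_{j,k})_{(j,k) \in \Lambda_{j_1}}$ and $\hat\beta_{\hat\xi_{j,n}} = (\hat\beta_{\hat\xi_{j,n},(j,k)})_{(j,k) \in \Lambda_{j_1}}$. Then, if $\|f - f_n\|_\infty \le \frac{1}{4} \|f\|_\infty$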



and iV n < 7^~j\3 logn' where k is a numerical constant and r is the degree of the polynomials, there 
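exists a constant $M_4 > 0$ such that for all sufficiently large $n$:

$$E\left\|\hat\beta_{\hat\xi_{j,n}} - \beta\right\|_2^2 \le M_4 \left(\frac{n}{\log n}\right)^{-\frac{2s}{2s+1}}$$

uniformly over $F^s_{p,q}(M)$.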



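Proof. Recall that $f_n$ is the orthogonal projection of $f$ on the space $V_n$ of piecewise polynomials of degree $r$ on a dyadic partition with step $2^{-J_n}$. The dimension of $V_n$ is $N_n = (r + 1) 2^{J_n}$. Let $\hat f_n$ similarly be the orthogonal projection of $I_n$ on $V_n$. By doing analogous work as the one done to obtain (4.5), we get that

$$E\left\|\hat\beta_{\hat\xi_{j,n}} - \beta\right\|_2^2 = T_1 + T_2 + T_3,$$ (4.15)

where $T_1 = \sum_{k=0}^{2^{j_0}-1} E\left(\hat a_{j_0,k} - a_{j_0,k}\right)^2$ does not depend on $\hat\xi_{j,n}$. Therefore, by (4.6), $T_1 = O\left(n^{-\frac{2s}{2s+1}}\right)$. For $T_2$ and $T_3$ we have that

$$T_2 = \sum_{j=j_0}^{j_1} \sum_{k=0}^{2^j-1} E\left[(\hat b_{j,k} - b_{j,k})^2 I\left(|\hat b_{j,k}| > \hat\xi_{j,n}\right)\right]$$

and

$$T_3 = \sum_{j=j_0}^{j_1} \sum_{k=0}^{2^j-1} E\left[b_{j,k}^2\, I\left(|\hat b_{j,k}| \le \hat\xi_{j,n}\right)\right].$$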

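Using the same decomposition as in the proof of Lemma 4.3 we get that

$$T_2 + T_3 \le \sum_{j=j_0}^{j_1} \sum_{k=0}^{2^j-1} E\left\{(\hat b_{j,k} - b_{j,k})^2 I\left(|b_{j,k}| > \frac{\hat\xi_{j,n}}{2}\right)\right\} + \sum_{j=j_0}^{j_1} \sum_{k=0}^{2^j-1} E\left\{b_{j,k}^2\, I\left(|b_{j,k}| \le 2\hat\xi_{j,n}\right)\right\} + 5 \sum_{j=j_0}^{j_1} \sum_{k=0}^{2^j-1} E\left\{(\hat b_{j,k} - b_{j,k})^2 I\left(|\hat b_{j,k} - b_{j,k}| > \frac{\hat\xi_{j,n}}{2}\right)\right\} := T' + T'' + T'''.$$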



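Now we bound $T'''$. Using the Cauchy-Schwarz inequality, one obtains

$$T''' \le 5 \sum_{j=j_0}^{j_1} \sum_{k=0}^{2^j-1} E^{1/2}\left[(\hat b_{j,k} - b_{j,k})^4\right] P^{1/2}\left(|\hat b_{j,k} - b_{j,k}| > \frac{\hat\xi_{j,n}}{2}\right).$$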
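From Lemma 4.2 we have that for any $y > 0$ the following exponential inequality holds:

$$P\left(|\hat b_{j,k} - b_{j,k}| > 2\|f\|_\infty \left(\sqrt{\frac{y}{n}} + 2^{j/2} \|\psi\|_\infty \frac{y}{n}\right) + \frac{C_*}{\sqrt{n}}\right) \le 2e^{-y}.$$ (4.16)

As in Comte [7], let $\Theta_{n,b} = \left\{ \left| \frac{\|\hat f_n\|_\infty}{\|f\|_\infty} - 1 \right| < b \right\}$, with $b \in (0, 1)$. Then, using that $P(|\hat b_{j,k} - b_{j,k}| > B + D) \le P(|\hat b_{j,k} - b_{j,k}| > B)$ for $B, D > 0$, and taking $x_n = \frac{\delta \log n}{(1-b)^2}$, one gets

$$P\left(|\hat b_{j,k} - b_{j,k}| > \frac{\hat\xi_{j,n}}{2}\right) = P\left(|\hat b_{j,k} - b_{j,k}| > 2\|\hat f_n\|_\infty \left(\sqrt{\frac{x_n}{n}} + 2^{j/2} \|\psi\|_\infty \frac{x_n}{n}\right) + \sqrt{\frac{\log n}{n}}\right)$$
$$\le P\left(|\hat b_{j,k} - b_{j,k}| > \frac{\hat\xi_{j,n}}{2} \,\Big|\, \Theta_{n,b}\right) P(\Theta_{n,b}) + P\left(|\hat b_{j,k} - b_{j,k}| > \frac{\hat\xi_{j,n}}{2} \,\Big|\, \Theta^c_{n,b}\right) P(\Theta^c_{n,b}) := P_1 P(\Theta_{n,b}) + P_2 P(\Theta^c_{n,b}).$$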

In Comte [7] it is proved that if $\|f - f_n\|_\infty \le \frac{1}{4} \|f\|_\infty$ then $P(\Theta^c_{n,b}) \le O(n^{-4})$ for the choices of



1 > 6~ Ia/- = 0.841 > I and N n < — — r^r^— > where k = ^ is the numerical constant in the 

— b y ir — 4 " — 36(r+l) iogn' 3d 

hypothesis of our theorem. Following its proof it can be shown that this bound can be improved 

taking k = Ln and b as before (see the three last equations of page 290 in |7J ) . With this selection 

Sb [s) 



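of $\kappa$ we obtain that $P(\Theta^c_{n,b}) \le O(n^{-6})$. Using that $P(\Theta_{n,b}) = O(1)$ and $P_2 = O(1)$, it only remains to bound the conditional probability $P_1$. On $\Theta_{n,b}$ the following inequalities hold:

$$\|\hat f_n\|_\infty \ge (1 - b) \|f\|_\infty \quad \text{and} \quad \|\hat f_n\|_\infty \le (1 + b) \|f\|_\infty.$$ (4.17)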



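Then, using (4.17) we get

$$P_1 \le P\left(|\hat b_{j,k} - b_{j,k}| > 2\|f\|_\infty \left(\sqrt{\frac{\delta \log n}{n}} + 2^{j/2} \|\psi\|_\infty \frac{\delta \log n}{n}\right) + \frac{C_*}{\sqrt{n}}\right) \le 2e^{-\delta \log n} = 2n^{-\delta},$$

where the last inequality is obtained using (4.16) for $y = \delta \log n > 0$. Hence, using that $\delta = 6$, we get $P\left(|\hat b_{j,k} - b_{j,k}| > \frac{\hat\xi_{j,n}}{2}\right) \le O(n^{-6})$. Therefore

$$T''' \le C \sum_{j=j_0}^{j_1} \sum_{k=0}^{2^j-1} n \cdot n^{-3} \le O\left(\frac{n^{-1}}{\log n}\right) \le O\left(n^{-\frac{2s}{2s+1}}\right).$$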



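Now we bound $T'$. Let $j_A$ be the integer such that $2^{j_A} \ge \left(\frac{n}{\log n}\right)^{\frac{1}{2s+1}} > 2^{j_A - 1}$, then $T' = T'_1 + T'_2$, where the first component is computed over the set of indices $j_0 \le j \le j_A$ and the second component over $j_A + 1 \le j \le j_1$. Hence, using Lemma 4.1 we obtain

$$T'_1 \le \sum_{j=j_0}^{j_A} \sum_{k=0}^{2^j-1} E(\hat b_{j,k} - b_{j,k})^2 = O\left(\frac{2^{j_A}}{n}\right) = O\left(\left(\frac{n}{\log n}\right)^{\frac{1}{2s+1}} n^{-1}\right) \le O\left(n^{-\frac{2s}{2s+1}}\right).$$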



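To bound $T'_2$, note that $T'_2 := T'_{2,1} + T'_{2,2}$, where

$$T'_{2,1} = \sum_{j=j_A+1}^{j_1} \sum_{k=0}^{2^j-1} E\left\{(\hat b_{j,k} - b_{j,k})^2 I\left(|b_{j,k}| > \frac{\hat\xi_{j,n}}{2}\right) I(\Theta_{n,b})\right\}$$

and

$$T'_{2,2} = \sum_{j=j_A+1}^{j_1} \sum_{k=0}^{2^j-1} E\left\{(\hat b_{j,k} - b_{j,k})^2 I\left(|b_{j,k}| > \frac{\hat\xi_{j,n}}{2}\right) I(\Theta^c_{n,b})\right\}.$$

Using that on $\Theta_{n,b}$ inequality (4.17) holds and following the same procedures as in the proof of Theorem 2.6, we get the desired bound for $T'_{2,1}$:

$$T'_{2,1} \le \frac{C}{n} \sum_{j=j_A+1}^{j_1} \sum_{k=0}^{2^j-1} I\left(|b_{j,k}| > 2(1-b)\|f\|_\infty \left(\sqrt{\frac{\delta \log n}{(1-b)^2 n}} + 2^{j/2} \|\psi\|_\infty \frac{\delta \log n}{(1-b)^2 n}\right)\right)$$ (4.18)
$$\le \frac{C}{n} \sum_{j=j_A+1}^{j_1} \sum_{k=0}^{2^j-1} \frac{|b_{j,k}|^2}{4\|f\|_\infty^2 \left(\sqrt{\frac{\delta \log n}{n}} + 2^{j/2} \|\psi\|_\infty \frac{\delta \log n}{(1-b) n}\right)^2} \le O\left(2^{-2s^* j_A}\right) = O\left(\left(\frac{n}{\log n}\right)^{-\frac{2s^*}{2s+1}}\right),$$

where we have used that $\sqrt{\delta \log n} + \|\psi\|_\infty (1-b)^{-1} \delta\, n^{-\frac{s}{2s+1}} (\log n)^{\frac{4s+1}{4s+2}} \to +\infty$ when $n \to +\infty$ and that condition (4.8) is satisfied. Now remark that if $p = 2$ then $s^* = s$ and thus

$$T'_{2,1} = O\left(\left(\frac{n}{\log n}\right)^{-\frac{2s}{2s+1}}\right).$$ (4.19)

For the case $1 \le p < 2$, from (4.18) we have that

$$T'_{2,1} \le \frac{C}{n} \sum_{j=j_A+1}^{j_1} \sum_{k=0}^{2^j-1} I\left(|b_{j,k}| > 2\|f\|_\infty \left(\sqrt{\frac{\delta \log n}{n}} + 2^{j/2} \|\psi\|_\infty \frac{\delta \log n}{(1-b) n}\right)\right) \le \frac{C}{n} \sum_{j=j_A+1}^{j_1} \sum_{k=0}^{2^j-1} |b_{j,k}|^{-p} |b_{j,k}|^p\, I\left(|b_{j,k}|^{-p} < \left(2\|f\|_\infty \sqrt{\frac{\delta \log n}{n}}\right)^{-p}\right)$$
$$\le \frac{(\log n)^{-p/2}}{n^{1-p/2}}\, C(\|f\|_\infty, \delta, p) \sum_{j=j_A+1}^{j_1} \sum_{k=0}^{2^j-1} |b_{j,k}|^p \le O\left(\left(\frac{n}{\log n}\right)^{-\frac{2s}{2s+1}}\right),$$ (4.20)

where we have used condition (4.10). Hence $T'_{2,1} = O\left(\left(\frac{n}{\log n}\right)^{-\frac{2s}{2s+1}}\right)$.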
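Now we bound $T'_{2,2}$. Using the Cauchy-Schwarz inequality, we have

$$T'_{2,2} \le \sum_{j=j_A+1}^{j_1} \sum_{k=0}^{2^j-1} E^{1/2}\left[(\hat b_{j,k} - b_{j,k})^4\right] P^{1/2}\left(\Theta^c_{n,b}\right) \le C \sum_{j=j_A+1}^{j_1} \sum_{k=0}^{2^j-1} n \cdot n^{-3} \le O\left(\frac{n^{-1}}{\log n}\right) \le O\left(n^{-\frac{2s}{2s+1}}\right),$$ (4.21)

where we have used that $E\left[(\hat b_{j,k} - b_{j,k})^4\right] = O(n^2)$ and that $P(\Theta^c_{n,b}) \le O(n^{-6})$. Then, putting together (4.19), (4.20) and (4.21), we obtain that $T'_2 = O\left(\left(\frac{n}{\log n}\right)^{-\frac{2s}{2s+1}}\right)$.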



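Now we bound $T''$. Set $j_A$ as before, then $T'' = T''_1 + T''_2$, where the first component is calculated over the set of indices $j_0 \le j \le j_A$ and the second component over $j_A + 1 \le j \le j_1$. Recall that $x_n = \frac{\delta \log n}{(1-b)^2}$, then $T''_1 \le T''_{1,1} + T''_{1,2}$, where

$$T''_{1,1} = \sum_{j=j_0}^{j_A} \sum_{k=0}^{2^j-1} b_{j,k}^2\, I\left(|b_{j,k}| \le 4\left[2(1+b)\|f\|_\infty \left(\sqrt{\frac{x_n}{n}} + 2^{j/2} \|\psi\|_\infty \frac{x_n}{n}\right) + \sqrt{\frac{\log n}{n}}\right]\right)$$

and

$$T''_{1,2} = \sum_{j=j_0}^{j_A} \sum_{k=0}^{2^j-1} b_{j,k}^2\, P\left(\Theta^c_{n,b}\right),$$

where we have used that given $\Theta_{n,b}$ inequality (4.17) holds. For $T''_{1,1}$ we have

$$T''_{1,1} \le \sum_{j=j_0}^{j_A} \sum_{k=0}^{2^j-1} 16\left[2(1+b)\|f\|_\infty \left(\sqrt{\frac{\delta \log n}{(1-b)^2 n}} + 2^{j/2} \|\psi\|_\infty \frac{\delta \log n}{(1-b)^2 n}\right) + \sqrt{\frac{\log n}{n}}\right]^2$$
$$\le C(\|f\|_\infty, b) \sum_{j=j_0}^{j_A} \sum_{k=0}^{2^j-1} \left[\frac{\delta \log n}{(1-b)^2 n} + 2^j \|\psi\|_\infty^2 \frac{\delta^2 (\log n)^2}{(1-b)^4 n^2} + \frac{\log n}{n}\right] = O\left(\left(\frac{n}{\log n}\right)^{-\frac{2s}{2s+1}}\right),$$ (4.22)

where we have used repeatedly that $(B + D)^2 \le 2(B^2 + D^2)$ for all $B, D \in \mathbb{R}$.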

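To bound $T''_{1,2}$ we use again that $P(\Theta^c_{n,b}) \le O(n^{-6})$ and that condition (4.8) is satisfied. Then

$$T''_{1,2} \le \sum_{j=j_0}^{j_A} \sum_{k=0}^{2^j-1} b_{j,k}^2\, n^{-6} \le n^{-6} \sum_{j=j_0}^{j_A} C 2^{-2js^*} = O\left(n^{-6}\, 2^{-2 j_0 s^*}\right) \le O\left(n^{-1}\right).$$ (4.23)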
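Hence, by (4.22) and (4.23), $T''_1 = O\left(\left(\frac{n}{\log n}\right)^{-\frac{2s}{2s+1}}\right)$. Now we bound $T''_2$:

$$T''_2 \le \sum_{j=j_A+1}^{j_1} \sum_{k=0}^{2^j-1} b_{j,k}^2 = O\left(2^{-2 j_A s^*}\right) = O\left(\left(\frac{n}{\log n}\right)^{-\frac{2s^*}{2s+1}}\right),$$

where we have used again the condition (4.8). Now remark that if $p = 2$ then $s^* = s$ and thus $T''_2 = O\left(\left(\frac{n}{\log n}\right)^{-\frac{2s}{2s+1}}\right)$. For $1 \le p < 2$, we proceed as follows.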

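$$T''_2 \le \sum_{j=j_A+1}^{j_1} \sum_{k=0}^{2^j-1} b_{j,k}^2\, I\left(|b_{j,k}| \le 4\left[2(1+b)\|f\|_\infty \left(\sqrt{\frac{x_n}{n}} + 2^{j/2} \|\psi\|_\infty \frac{x_n}{n}\right) + \sqrt{\frac{\log n}{n}}\right]\right) + \sum_{j=j_A+1}^{j_1} \sum_{k=0}^{2^j-1} b_{j,k}^2\, P\left(\Theta^c_{n,b}\right) := T''_{2,1} + T''_{2,2},$$

where we have used that on $\Theta_{n,b}$ inequality (4.17) holds. Now we bound $T''_{2,1}$:

$$T''_{2,1} \le \sum_{j=j_A+1}^{j_1} \sum_{k=0}^{2^j-1} \left(8(1+b)\|f\|_\infty \left(\sqrt{\frac{\delta \log n}{(1-b)^2 n}} + 2^{j/2} \|\psi\|_\infty \frac{\delta \log n}{(1-b)^2 n}\right) + 4\sqrt{\frac{\log n}{n}}\right)^{2-p} |b_{j,k}|^p$$
$$\le C(\|f\|_\infty, \|\psi\|_\infty, \delta, b)^{2-p} \left(\frac{\log n}{n}\right)^{\frac{2-p}{2}} \sum_{j=j_A+1}^{j_1} \sum_{k=0}^{2^j-1} |b_{j,k}|^p \le O\left(\left(\frac{\log n}{n}\right)^{\frac{2-p}{2}} 2^{-p j_A s^*}\right) = O\left(\left(\frac{n}{\log n}\right)^{-\frac{2s}{2s+1}}\right),$$ (4.24)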



where we have used that condition (4.10) is satisfied. To bound $T''_{2,2}$ we use again that $P(\Theta^c_{n,b}) \le O(n^{-6})$ and that condition (4.8) also holds. Then, from (4.23) we get

$$T''_{2,2} \le \sum_{j=j_A+1}^{j_1} \sum_{k=0}^{2^j-1} b_{j,k}^2\, P\left(\Theta^c_{n,b}\right) \le n^{-6} \sum_{j=j_A+1}^{j_1} C 2^{-2js^*} = O\left(n^{-6}\, 2^{-2 j_A s^*}\right) \le O\left(n^{-1}\right).$$ (4.25)

Hence, by (4.24) and (4.25), $T'' = O\left(\left(\frac{n}{\log n}\right)^{-\frac{2s}{2s+1}}\right)$. Combining all terms in (4.15), we conclude that:

$$E\left\|\hat\beta_{\hat\xi_{j,n}} - \beta\right\|_2^2 = O\left(\left(\frac{n}{\log n}\right)^{-\frac{2s}{2s+1}}\right).$$

This completes the proof. ■



4.2 Proof of Theorem 2.5

First, one needs the following proposition. 

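Proposition 4.5 Let $\beta_{j,k} = \langle f, \psi_{j,k} \rangle$ and $\hat\beta_{j,k} = \langle I_n, \psi_{j,k} \rangle$ with $(j,k) \in \Lambda_{j_1}$. Suppose that $f \in F^s_{p,q}(M)$ with $s > 1/p$ and $1 \le p \le 2$. Let $M_1 > 0$ be a constant such that $M_1^{-1} \le f \le M_1$ (see Section 2). Let $\varepsilon_{j_1} = 2M_1^2 e^{2\gamma_{j_1}+1} D_{j_1} A_{j_1}$. If $\varepsilon_{j_1} \le 1$, then there exists $\theta^*_{j_1} \in \mathbb{R}^{\#\Lambda_{j_1}}$ such that:

$$\langle f_{j_1,\theta^*_{j_1}}, \psi_{j,k} \rangle = \langle f, \psi_{j,k} \rangle = \beta_{j,k} \quad \text{for all } (j,k) \in \Lambda_{j_1}.$$

Moreover, the following inequality holds (approximation error)

$$\Delta\left(f; f_{j_1,\theta^*_{j_1}}\right) \le \frac{M_1}{2}\, e^{\gamma_{j_1}} D_{j_1}^2.$$

Suppose that Assumptions 1 and 2 hold. Let $\eta_{j_1,n} = 4M_1^2 e^{2\gamma_{j_1}+2\varepsilon_{j_1}+2} A_{j_1}^2 \frac{2^{j_1}}{n}$. Then, for every $\lambda > 0$ such that $\lambda \le \eta_{j_1,n}^{-1}$ there exists a set $\Omega_{n,1}$ of probability less than $M_2 \lambda^{-1}$, where $M_2$ is the constant defined in Lemma 4.1, such that outside the set $\Omega_{n,1}$ there exists some $\hat\theta_n \in \mathbb{R}^{\#\Lambda_{j_1}}$ which satisfies:

$$\langle f_{j_1,\hat\theta_n}, \psi_{j,k} \rangle = \langle I_n, \psi_{j,k} \rangle = \hat\beta_{j,k} \quad \text{for all } (j,k) \in \Lambda_{j_1}.$$

Moreover, outside the set $\Omega_{n,1}$, the following inequality holds (estimation error)

$$\Delta\left(f_{j_1,\theta^*_{j_1}}; f_{j_1,\hat\theta_n}\right) \le 2M_1 e^{\gamma_{j_1}+\varepsilon_{j_1}+1} \lambda \frac{2^{j_1}}{n}.$$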



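Proof. Approximation error: Recall that $\beta_{j,k} = \langle f, \psi_{j,k} \rangle$ and let $\beta = (\beta_{j,k})_{(j,k) \in \Lambda_{j_1}}$. Define by $g_{j_1} = \sum_{(j,k) \in \Lambda_{j_1}} \theta_{j,k} \psi_{j,k}$ an approximation of $g = \log(f)$ and let $\beta_{0,(j,k)} = \langle f_{j_1,\theta_{j_1}}, \psi_{j,k} \rangle = \langle \exp(g_{j_1}), \psi_{j,k} \rangle$ with $\theta_{j_1} = (\theta_{j,k})_{(j,k) \in \Lambda_{j_1}}$ and $\beta_0 = (\beta_{0,(j,k)})_{(j,k) \in \Lambda_{j_1}}$. Observe that the coefficients $\beta_{j,k} - \beta_{0,(j,k)}$, $(j,k) \in \Lambda_{j_1}$, are the coefficients of the orthonormal projection of $f - f_{j_1,\theta_{j_1}}$ onto $V_{j_1}$. Hence by Bessel's inequality, $\|\beta - \beta_0\|_2^2 \le \|f - f_{j_1,\theta_{j_1}}\|_{L_2}^2$. Using Lemma 2.4 and Lemma 2 in Barron and Sheu [2], we get that:

$$\|\beta - \beta_0\|_2^2 \le \int \left(f - f_{j_1,\theta_{j_1}}\right)^2 d\mu \le M_1 \int \frac{\left(f - f_{j_1,\theta_{j_1}}\right)^2}{f}\, d\mu \le M_1 e^{2\left\|\log\left(f / f_{j_1,\theta_{j_1}}\right)\right\|_\infty} \int f \left(\log\left(\frac{f}{f_{j_1,\theta_{j_1}}}\right)\right)^2 d\mu \le M_1^2 e^{2\gamma_{j_1}} \|g - g_{j_1}\|_{L_2}^2 = M_1^2 e^{2\gamma_{j_1}} D_{j_1}^2.$$

Then, one can easily check that $b = e^{\|\log(f_{j_1,\theta_{j_1}})\|_\infty} \le M_1 e^{\gamma_{j_1}}$. Thus the assumption that $\varepsilon_{j_1} \le 1$ implies that the inequality $\|\beta - \beta_0\|_2 \le M_1 e^{\gamma_{j_1}} D_{j_1} \le \frac{1}{2ebA_{j_1}}$ is satisfied. Hence, Lemma 2.4 can be applied with $\theta_0 = \theta_{j_1}$, $\tilde\beta = \beta$ and $b = \exp\left(\|\log(f_{j_1,\theta_{j_1}})\|_\infty\right)$, which implies that there exists $\theta^*_{j_1} = \theta(\beta)$ such that $\langle f_{j_1,\theta^*_{j_1}}, \psi_{j,k} \rangle = \beta_{j,k}$ for all $(j,k) \in \Lambda_{j_1}$.

By the Pythagorean-like relationship (2.2), we obtain that $\Delta\left(f; f_{j_1,\theta^*_{j_1}}\right) \le \Delta\left(f; f_{j_1,\theta_{j_1}}\right)$. Now we use a result which states that if $f$ and $g$ are two functions in $L_2([0,1])$ such that $\log\left(\frac{f}{g}\right)$ is bounded, then $\Delta(f; g) \le \frac{1}{2} e^{\left\|\log\left(\frac{f}{g}\right)\right\|_\infty} \int_0^1 f \left(\log\left(\frac{f}{g}\right)\right)^2 d\mu$, where $\mu$ denotes the Lebesgue measure on $[0,1]$ (see Lemma A.1 in Antoniadis and Bigot [1]). Hence, it follows that

$$\Delta\left(f; f_{j_1,\theta_{j_1}}\right) \le \frac{1}{2}\, e^{\left\|\log\left(f / f_{j_1,\theta_{j_1}}\right)\right\|_\infty} \int_0^1 f \left(\log\left(\frac{f}{f_{j_1,\theta_{j_1}}}\right)\right)^2 d\mu \le \frac{M_1}{2}\, e^{\gamma_{j_1}} \|g - g_{j_1}\|_{L_2}^2 = \frac{M_1}{2}\, e^{\gamma_{j_1}} D_{j_1}^2,$$

which completes the proof for the approximation error.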

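Estimation error: Applying again Lemma 2.4 with $\theta_0 = \theta^*_{j_1}$, $\beta_{0,(j,k)} = \langle f_{j_1,\theta^*_{j_1}}, \psi_{j,k} \rangle = \beta_{j,k}$, $\tilde\beta = \hat\beta$, where $\hat\beta = (\hat\beta_{j,k})_{(j,k) \in \Lambda_{j_1}}$, and $b = \exp\left(\|\log(f_{j_1,\theta^*_{j_1}})\|_\infty\right)$, we obtain that if $\|\hat\beta - \beta\|_2 \le \frac{1}{2ebA_{j_1}}$ with $\beta = (\beta_{j,k})_{(j,k) \in \Lambda_{j_1}}$, then there exists $\hat\theta_n = \theta(\hat\beta)$ such that $\langle f_{j_1,\hat\theta_n}, \psi_{j,k} \rangle = \hat\beta_{j,k}$ for all $(j,k) \in \Lambda_{j_1}$.

Hence, it remains to prove that our assumptions imply that the event $\|\hat\beta - \beta\|_2 \le \frac{1}{2ebA_{j_1}}$ holds with probability $1 - M_2 \lambda^{-1}$. First remark that $b \le M_1 e^{\gamma_{j_1}+\varepsilon_{j_1}}$ and that by Markov's inequality and Lemma 4.1 we obtain that for any $\lambda > 0$,

$$P\left(\|\hat\beta - \beta\|_2^2 \ge \lambda \frac{2^{j_1}}{n}\right) \le \frac{1}{\lambda} \frac{n}{2^{j_1}}\, E\|\hat\beta - \beta\|_2^2 \le M_2 \lambda^{-1}.$$

Hence, outside a set $\Omega_{n,1}$ of probability less than $M_2 \lambda^{-1}$, we have $\|\hat\beta - \beta\|_2^2 < \lambda \frac{2^{j_1}}{n}$. Therefore, the condition $\|\hat\beta - \beta\|_2 \le \frac{1}{2ebA_{j_1}}$ holds if $\left(\lambda \frac{2^{j_1}}{n}\right)^{1/2} \le \frac{1}{2ebA_{j_1}}$, which is equivalent to $4e^2 b^2 A_{j_1}^2 \lambda \frac{2^{j_1}}{n} \le 1$. This last inequality is true if $\eta_{j_1,n} = 4M_1^2 e^{2\gamma_{j_1}+2\varepsilon_{j_1}+2} A_{j_1}^2 \frac{2^{j_1}}{n} \le \frac{1}{\lambda}$, using that $b^2 \le M_1^2 e^{2\gamma_{j_1}+2\varepsilon_{j_1}}$. Hence, outside the set $\Omega_{n,1}$, our assumptions imply that there exists $\hat\theta_n = \theta(\hat\beta)$ such that $\langle f_{j_1,\hat\theta_n}, \psi_{j,k} \rangle = \hat\beta_{j,k}$ for all $(j,k) \in \Lambda_{j_1}$. Finally, outside the set $\Omega_{n,1}$, by using the bound given in Lemma 2.4 one obtains the following inequality for the estimation error

$$\Delta\left(f_{j_1,\theta^*_{j_1}}; f_{j_1,\hat\theta_n}\right) \le 2M_1 e^{\gamma_{j_1}+\varepsilon_{j_1}+1} \lambda \frac{2^{j_1}}{n},$$

which completes the proof of Proposition 4.5. ■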

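Our assumptions on $j_1(n)$ imply that $\frac{1}{2} n^{\frac{1}{2s+1}} \le 2^{j_1(n)} \le n^{\frac{1}{2s+1}}$. Therefore, using Lemma 2.2 one has that for all $f \in F^s_{2,2}(M)$ with $s > 1/2$

$$\gamma_{j_1(n)} \le C n^{-\frac{2s-1}{2(2s+1)}}, \quad A_{j_1(n)} \le C n^{\frac{1}{2(2s+1)}}, \quad D_{j_1(n)} \le C n^{-\frac{s}{2s+1}},$$

where $C$ denotes constants not depending on $g = \log(f)$. Hence, $\lim_{n \to +\infty} \varepsilon_{j_1(n)} = \lim_{n \to +\infty} 2M_1^2 e^{2\gamma_{j_1(n)}+1} A_{j_1(n)} D_{j_1(n)} = 0$, uniformly over $F^s_{2,2}(M)$ for $s > 1/2$. For all sufficiently large $n$, $\varepsilon_{j_1(n)} \le 1$ and thus, using Proposition 4.5, there exists $\theta^*_{j_1(n)} \in \mathbb{R}^{\#\Lambda_{j_1(n)}}$ such that

$$\Delta\left(f; f_{j_1(n),\theta^*_{j_1(n)}}\right) \le \frac{M_1}{2}\, e^{\gamma_{j_1(n)}} D_{j_1(n)}^2 \le C n^{-\frac{2s}{2s+1}} \quad \text{for all } f \in F^s_{2,2}(M).$$ (4.26)

By the same arguments it follows that $\lim_{n \to +\infty} \eta_{j_1(n),n} = \lim_{n \to +\infty} 4M_1^2 e^{2\gamma_{j_1(n)}+2\varepsilon_{j_1(n)}+2} A_{j_1(n)}^2 \frac{2^{j_1(n)}}{n} = 0$, uniformly over $F^s_{2,2}(M)$ for $s > 1/2$. Now let $\lambda > 0$. The above result shows that for sufficiently large $n$, $\lambda \le \eta_{j_1(n),n}^{-1}$, and thus using Proposition 4.5 it follows that there exists a set $\Omega_{n,1}$ of probability less than $M_2 \lambda^{-1}$ such that outside this set there exists $\hat\theta_n \in \mathbb{R}^{\#\Lambda_{j_1(n)}}$ which satisfies:

$$\Delta\left(f_{j_1(n),\theta^*_{j_1(n)}}; f_{j_1(n),\hat\theta_n}\right) \le 2M_1 e^{\gamma_{j_1(n)}+\varepsilon_{j_1(n)}+1} \lambda \frac{2^{j_1(n)}}{n} \le C \lambda n^{-\frac{2s}{2s+1}},$$ (4.27)

for all $f \in F^s_{2,2}(M)$. Then, by the Pythagorean-like identity (2.2) it follows that outside the set $\Omega_{n,1}$

$$\Delta\left(f; f_{j_1(n),\hat\theta_n}\right) = \Delta\left(f; f_{j_1(n),\theta^*_{j_1(n)}}\right) + \Delta\left(f_{j_1(n),\theta^*_{j_1(n)}}; f_{j_1(n),\hat\theta_n}\right),$$

and thus Theorem 2.5 follows from inequalities (4.26) and (4.27).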
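The Pythagorean-like identity used in the last step can be illustrated on a toy discrete analogue. The sketch below is an assumption-laden illustration, not the paper's estimator: it replaces $[0,1]$ by four grid points with counting measure, uses a single basis function `psi`, takes $\Delta(f;g) = \sum (f \log(f/g) - f + g)$, and finds the moment-matching parameter by bisection. All names are hypothetical.

```python
import math

def kl(f, g):
    """Discrete analogue of Delta(f; g) for positive vectors f and g."""
    return sum(fi * math.log(fi / gi) - fi + gi for fi, gi in zip(f, g))

def exp_family(theta, psi):
    """Member f_theta = exp(theta * psi) of a one-parameter exponential family."""
    return [math.exp(theta * p) for p in psi]

def moment(f, psi):
    """Discrete analogue of the coefficient <f, psi>."""
    return sum(fi * p for fi, p in zip(f, psi))

def information_projection(f, psi, lo=-50.0, hi=50.0, iters=200):
    """Bisection for theta* such that <f_theta*, psi> = <f, psi>.

    The map theta -> <f_theta, psi> is increasing, so bisection applies."""
    target = moment(f, psi)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if moment(exp_family(mid, psi), psi) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

psi = [-1.0, -0.5, 0.5, 1.0]
f = [0.5, 1.0, 2.0, 1.5]
theta_star = information_projection(f, psi)
f_star = exp_family(theta_star, psi)
f_other = exp_family(0.2, psi)          # any other member of the family

lhs = kl(f, f_other)
rhs = kl(f, f_star) + kl(f_star, f_other)
print(abs(lhs - rhs))
```

The identity holds because the two sides differ by $(\theta^* - \theta)\left(\langle f, \psi \rangle - \langle f_{\theta^*}, \psi \rangle\right)$, which vanishes exactly when the coefficients are matched.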

4.3 Proof of Theorem 2.6

First, one needs the following proposition. 

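Proposition 4.6 Let $\beta_{j,k} := \langle f, \psi_{j,k} \rangle$ and $\hat\beta_{\xi_{j,n},(j,k)} := \delta_{\xi_{j,n}}(\hat\beta_{j,k})$ with $(j,k) \in \Lambda_{j_1}$. Assume that $f \in F^s_{p,q}(A)$ with $s > 1/p$ and $1 \le p \le 2$. Let $M_1 > 0$ be a constant such that $M_1^{-1} \le f \le M_1$ (see Section 2). Let $\varepsilon_{j_1} = 2M_1^2 e^{2\gamma_{j_1}+1} D_{j_1} A_{j_1}$. If $\varepsilon_{j_1} \le 1$, then there exists $\theta^*_{j_1} \in \mathbb{R}^{\#\Lambda_{j_1}}$ such that:

$$\langle f_{j_1,\theta^*_{j_1}}, \psi_{j,k} \rangle = \langle f, \psi_{j,k} \rangle = \beta_{j,k} \quad \text{for all } (j,k) \in \Lambda_{j_1}.$$

Moreover, the following inequality holds (approximation error)

$$\Delta\left(f; f_{j_1,\theta^*_{j_1}}\right) \le \frac{M_1}{2}\, e^{\gamma_{j_1}} D_{j_1}^2.$$

Suppose that Assumptions 1 and 2 hold. Let $\eta_{j_1,n} = 4M_1^2 e^{2\gamma_{j_1}+2\varepsilon_{j_1}+2} A_{j_1}^2 \left(\frac{n}{\log n}\right)^{-\frac{2s}{2s+1}}$. Then, for every $\lambda > 0$ such that $\lambda \le \eta_{j_1,n}^{-1}$ there exists a set $\Omega_{n,2}$ of probability less than $M_3 \lambda^{-1}$, where $M_3$ is the constant defined in Lemma 4.3, such that outside the set $\Omega_{n,2}$ there exists some $\hat\theta_n \in \mathbb{R}^{\#\Lambda_{j_1}}$ which satisfies:

$$\langle f_{j_1,\hat\theta_n,\xi_{j,n}}, \psi_{j,k} \rangle = \delta_{\xi_{j,n}}(\hat\beta_{j,k}) = \hat\beta_{\xi_{j,n},(j,k)} \quad \text{for all } (j,k) \in \Lambda_{j_1}.$$

Moreover, outside the set $\Omega_{n,2}$, the following inequality holds (estimation error)

$$\Delta\left(f_{j_1,\theta^*_{j_1}}; f_{j_1,\hat\theta_n,\xi_{j,n}}\right) \le 2M_1 e^{\gamma_{j_1}+\varepsilon_{j_1}+1} \lambda \left(\frac{n}{\log n}\right)^{-\frac{2s}{2s+1}}.$$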
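Proof. Approximation error: The proof is the same as that of Proposition 4.5.

Estimation error: Applying Lemma 2.4 with $\theta_0 = \theta^*_{j_1}$, $\beta_{0,(j,k)} = \langle f_{j_1,\theta^*_{j_1}}, \psi_{j,k} \rangle = \beta_{j,k}$, $\tilde\beta = \hat\beta_{\xi_{j,n}}$, where $\hat\beta_{\xi_{j,n}} = (\hat\beta_{\xi_{j,n},(j,k)})_{(j,k) \in \Lambda_{j_1}}$, and $b = \exp\left(\|\log(f_{j_1,\theta^*_{j_1}})\|_\infty\right)$, we obtain that if $\|\hat\beta_{\xi_{j,n}} - \beta\|_2 \le \frac{1}{2ebA_{j_1}}$ with $\beta = (\beta_{j,k})_{(j,k) \in \Lambda_{j_1}}$, then there exists $\hat\theta_n = \theta(\hat\beta_{\xi_{j,n}})$ such that $\langle f_{j_1,\hat\theta_n,\xi_{j,n}}, \psi_{j,k} \rangle = \hat\beta_{\xi_{j,n},(j,k)}$ for all $(j,k) \in \Lambda_{j_1}$.

Hence, it remains to prove that our assumptions imply that the event $\|\hat\beta_{\xi_{j,n}} - \beta\|_2 \le \frac{1}{2ebA_{j_1}}$ holds with probability $1 - M_3 \lambda^{-1}$. First remark that $b \le M_1 e^{\gamma_{j_1}+\varepsilon_{j_1}}$ and that by Markov's inequality and Lemma 4.3 we obtain that for any $\lambda > 0$,

$$P\left(\|\hat\beta_{\xi_{j,n}} - \beta\|_2^2 \ge \lambda \left(\frac{n}{\log n}\right)^{-\frac{2s}{2s+1}}\right) \le \frac{1}{\lambda} \left(\frac{n}{\log n}\right)^{\frac{2s}{2s+1}} E\|\hat\beta_{\xi_{j,n}} - \beta\|_2^2 \le \frac{M_3}{\lambda} \left(\frac{n}{\log n}\right)^{\frac{2s}{2s+1}} \left(\frac{n}{\log n}\right)^{-\frac{2s}{2s+1}} = M_3 \lambda^{-1}.$$

Hence, outside a set $\Omega_{n,2}$ of probability less than $M_3 \lambda^{-1}$, it holds that $\|\hat\beta_{\xi_{j,n}} - \beta\|_2^2 \le \lambda \left(\frac{n}{\log n}\right)^{-\frac{2s}{2s+1}}$. Therefore, the condition $\|\hat\beta_{\xi_{j,n}} - \beta\|_2 \le \frac{1}{2ebA_{j_1}}$ holds if $\left(\lambda \left(\frac{n}{\log n}\right)^{-\frac{2s}{2s+1}}\right)^{1/2} \le \frac{1}{2ebA_{j_1}}$, which is equivalent to $4e^2 b^2 A_{j_1}^2 \lambda \left(\frac{n}{\log n}\right)^{-\frac{2s}{2s+1}} \le 1$. Using that $b^2 \le M_1^2 e^{2\gamma_{j_1}+2\varepsilon_{j_1}}$, the last inequality is true if $\eta_{j_1,n} = 4M_1^2 e^{2\gamma_{j_1}+2\varepsilon_{j_1}+2} A_{j_1}^2 \left(\frac{n}{\log n}\right)^{-\frac{2s}{2s+1}} \le \frac{1}{\lambda}$. Hence, outside the set $\Omega_{n,2}$, our assumptions imply that there exists $\hat\theta_n = \theta(\hat\beta_{\xi_{j,n}})$ such that $\langle f_{j_1,\hat\theta_n,\xi_{j,n}}, \psi_{j,k} \rangle = \hat\beta_{\xi_{j,n},(j,k)}$ for all $(j,k) \in \Lambda_{j_1}$. Finally, outside the set $\Omega_{n,2}$, by using the bound given in Lemma 2.4 one obtains the following inequality for the estimation error

$$\Delta\left(f_{j_1,\theta^*_{j_1}}; f_{j_1,\hat\theta_n,\xi_{j,n}}\right) \le 2M_1 e^{\gamma_{j_1}+\varepsilon_{j_1}+1} \lambda \left(\frac{n}{\log n}\right)^{-\frac{2s}{2s+1}},$$

which completes the proof of Proposition 4.6. ■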

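Our assumptions on $j_1(n)$ imply that $\frac{1}{2} \frac{n}{\log n} \le 2^{j_1(n)} \le \frac{n}{\log n}$. Therefore, using Lemma 2.2 one has that for all $f \in F^s_{p,q}(M)$ with $s > 1/p$,

$$\gamma_{j_1(n)} \le C \left(\frac{n}{\log n}\right)^{-\left(s - \frac{1}{p}\right)}, \quad A_{j_1(n)} \le C \left(\frac{n}{\log n}\right)^{\frac{1}{2}}, \quad D_{j_1(n)} \le C \left(\frac{n}{\log n}\right)^{-s^*},$$

where $C$ denotes constants not depending on $g = \log(f)$. Hence,

$$\lim_{n \to +\infty} \varepsilon_{j_1(n)} = \lim_{n \to +\infty} 2M_1^2 e^{2\gamma_{j_1(n)}+1} A_{j_1(n)} D_{j_1(n)} = 0,$$

uniformly over $F^s_{p,q}(M)$ for $s > 1/p$. For all sufficiently large $n$, $\varepsilon_{j_1(n)} \le 1$ and thus, using Proposition 4.6, there exists $\theta^*_{j_1(n)} \in \mathbb{R}^{\#\Lambda_{j_1(n)}}$ such that

$$\Delta\left(f; f_{j_1(n),\theta^*_{j_1(n)}}\right) \le \frac{M_1}{2}\, e^{\gamma_{j_1(n)}} D_{j_1(n)}^2 \le C \left(\frac{n}{\log n}\right)^{-2s^*} \quad \text{for all } f \in F^s_{p,q}(M).$$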
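Now remark that if $p = 2$ then $s^* = s > 1$ (by assumption), thus

$$\Delta\left(f; f_{j_1(n),\theta^*_{j_1(n)}}\right) \le C \left(\frac{n}{\log n}\right)^{-2s} \le C \left(\frac{n}{\log n}\right)^{-\frac{2s}{2s+1}}.$$

If $1 \le p < 2$ then one can check that condition $s > \frac{1}{2} + \frac{1}{p}$ implies that $2s^* > \frac{2s}{2s+1}$, hence

$$\Delta\left(f; f_{j_1(n),\theta^*_{j_1(n)}}\right) \le C \left(\frac{n}{\log n}\right)^{-\frac{2s}{2s+1}}.$$ (4.28)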
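The claim that $s > \frac{1}{2} + \frac{1}{p}$ implies $2s^* > \frac{2s}{2s+1}$ follows because $s^* = s + \frac{1}{2} - \frac{1}{p} > 1$ under that condition, while $\frac{2s}{2s+1} < 1$. A quick numerical sanity check over a small grid is sketched below; the function names and the grid are illustrative choices only.

```python
def s_star(s, p):
    """Effective smoothness s* = s + 1/2 - 1/p."""
    return s + 0.5 - 1.0 / p

def rate_condition_holds(s, p):
    """Check 2 s* > 2s / (2s + 1)."""
    return 2.0 * s_star(s, p) > 2.0 * s / (2.0 * s + 1.0)

grid_ok = all(
    rate_condition_holds(0.5 + 1.0 / p + eps, p)
    for p in (1.0, 1.25, 1.5, 1.75, 2.0)
    for eps in (0.01, 0.5, 2.0)
)
print(grid_ok)
```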

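By the same arguments it holds that

$$\lim_{n \to +\infty} \eta_{j_1(n),n} = \lim_{n \to +\infty} 4M_1^2 e^{2\left(\gamma_{j_1(n)}+\varepsilon_{j_1(n)}+1\right)} A_{j_1(n)}^2 \left(\frac{n}{\log n}\right)^{-\frac{2s}{2s+1}} = 0,$$

uniformly over $F^s_{p,q}(M)$ for $s > 1/p$. Now let $\lambda > 0$. The above result shows that for sufficiently large $n$, $\lambda \le \eta_{j_1(n),n}^{-1}$, and thus using Proposition 4.6 it follows that there exists a set $\Omega_{n,2}$ of probability less than $M_3 \lambda^{-1}$ such that outside this set there exists $\hat\theta_n \in \mathbb{R}^{\#\Lambda_{j_1(n)}}$ which satisfies:

$$\Delta\left(f_{j_1(n),\theta^*_{j_1(n)}}; f_{j_1(n),\hat\theta_n,\xi_{j,n}}\right) \le 2M_1 e^{\gamma_{j_1(n)}+\varepsilon_{j_1(n)}+1} \lambda \left(\frac{n}{\log n}\right)^{-\frac{2s}{2s+1}}$$ (4.29)

for all $f \in F^s_{p,q}(M)$. Then, by the Pythagorean-like identity (2.2) it follows that outside the set $\Omega_{n,2}$,

$$\Delta\left(f; f_{j_1(n),\hat\theta_n,\xi_{j,n}}\right) = \Delta\left(f; f_{j_1(n),\theta^*_{j_1(n)}}\right) + \Delta\left(f_{j_1(n),\theta^*_{j_1(n)}}; f_{j_1(n),\hat\theta_n,\xi_{j,n}}\right),$$

and thus Theorem 2.6 follows from inequalities (4.28) and (4.29).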
4.4 Proof of Theorem 2.7

The proof is analogous to the one of Theorem 2.6. It follows from Lemma 4.4.